Abstract

We consider a binary erasure version of the n-channel multiple descriptions problem with symmetric 
descriptions, i.e., the rates of the n descriptions are the same and the distortion constraint depends only 
on the number of messages received. We consider the case where there is no excess rate for every k out 
of n descriptions, i.e., any subset of k messages has a total rate of $R(D_k) = 1 - D_k$, where $R(\cdot)$ is the
Shannon rate-distortion function and $D_k$ is the distortion constraint when k descriptions are received
at the decoder. Our goal is to characterize the achievable distortions $D_1, D_2, \ldots, D_n$. We measure the
fidelity of reconstruction using two distortion criteria: an average-case distortion criterion, under which 
distortion is measured by taking the average of the per-letter distortion over all source sequences, and a 
worst-case distortion criterion, under which distortion is measured by taking the maximum of the per- 
letter distortion over all source sequences. We present achievability schemes, based on random binning 
for average-case distortion and systematic MDS (maximum distance separable) codes for worst-case
distortion, and prove optimality results for the corresponding achievable distortion regions. We then 
use the binary erasure multiple descriptions setup to propose a layered coding framework for multiple 
descriptions, which we then apply to vector Gaussian multiple descriptions and prove its optimality for 
symmetric scalar Gaussian multiple descriptions with two levels of receivers and no excess rate for the 
central receiver. We also prove a new outer bound for the general multi-terminal source coding problem
and use it to prove an optimality result for the robust binary erasure CEO problem. For the latter, we 
provide a tight lower bound on the distortion for $\ell$ messages for any coding scheme that achieves the
minimum achievable distortion for $k < \ell$ messages.

I. Introduction 

While the information-theoretic study of network capacity has played a pivotal role in the 
development of wireless communications [1], network rate-distortion theory has had a much
smaller impact on the design of practical systems. The reason for this is arguably two-fold.
First, the mathematically challenging nature of network source coding has hindered progress 
toward understanding the fundamental limits of lossy data compression. The rate regions of many 
important network source coding problems have yet to be characterized and solutions for even 
simple networks are analytically involved. Second, prominent network source coding problems 
often are poor models that abstract away key properties of practical systems. In particular, 
such models often fail to accurately capture the distortion resulting from source quantization in 
practical systems. 

This paper attempts to circumvent these two issues by focusing on the use of the erasure 
distortion measure [2, p. 370] for a binary source. The erasure distortion measure is well-suited
for digital sources since it does not permit the decoder to make errors in its reconstruction of 
the source, but allows it to declare an erasure for any source symbol about which it is uncertain. 
Errors in digital data streams generally wreak havoc unless detailed knowledge of the digital 
representation is used to minimize their impact. Erasures, however, are tolerable since they can 
be detected by higher-level applications, which can either interpolate to fill in the missing data 
or wait until enough data is received to correct all of the erasures. Erasure formulations should 
also be useful as starting points for the design of practical codes for network rate-distortion. In 
the theoretical development of modern channel codes like LDPC, many of the code designs and 
performance characterizations were first established for the erasure channel [3].

This paper looks at the binary erasure version of an important network source coding problem,
the multiple descriptions (MD) problem [4]-[13]. Multiple descriptions is a source coding
technique in which multiple encoded descriptions of a single source sequence are sent to the 
decoder over separate channels. This is an effective way to deal with channel failure and packet 
loss in packet networks, particularly in the case where retransmission of lost packets is not 
feasible (e.g., audio/video streaming) and the decoder must reconstruct the source with only the 
packets it has successfully received. The MD problem also constitutes a reasonable model for 
transmission of digital data (images, video, and sound) over peer-to-peer networks. 

An important regime within MD is that of no excess rate, i.e., the sum rate required to achieve 
distortion $D$ at the receiver equals $R(D)$, where $R(\cdot)$ is the Shannon rate-distortion function.
This is a useful regime to study, since it allows us to not sacrifice end-to-end performance for 
intermediate performance (i.e., when the number of received descriptions is less than the number 
required to achieve distortion D). For most sources, the no excess rate regime is characterized by
poor intermediate performance (e.g., [5]): if a coding scheme is near-optimal for k receptions,
it often yields high distortions for $m < k$ receptions. For binary erasure MD, however, it is
possible to obtain good intermediate performance under no excess rate. 

A. Results 

We focus on binary erasure MD with no excess rate for every k out of n descriptions, i.e., 
any subset consisting of k messages must have a total rate of $R(D_k)$, where $D_k$ is the distortion
constraint the decoder must obey when k messages are received. We consider symmetric descrip- 
tions, i.e., the rates of the n descriptions are the same and the distortion constraint depends only 
on the number of messages received. In fact, no excess rate implies symmetric descriptions for 
$k < n$: if every k out of n descriptions have sum rate $R(D_k)$, then each rate must be $R(D_k)/k$.
We examine two distortion criteria: an average-case distortion criterion, which measures the
reconstruction fidelity by the average of the per-letter distortion over all source sequences, and 
a worst-case distortion criterion, which measures the reconstruction fidelity by the maximum 
of the per-letter distortion over all source sequences. The average-case criterion is the standard 
criterion used in the literature. The worst-case criterion is less commonly used but arguably more 
appropriate in this setting. It is a universal distortion measure and is insensitive to the source 
model since it does not require a source distribution. Our main contributions are:

1) applying the binary erasure model to multiple description coding and focusing on the 
worst-case distortion criterion, 

2) proposing, for all n and k, coding schemes for both average-case and worst-case distortion 
criteria and characterizing their achievable distortion region when m < k descriptions 
are received at the decoder. The scheme for average-case distortion is based on random 
binning and can be viewed as a concatenation of (n, 1) and (n, k) source-channel erasure
codes [10]. The scheme for worst-case distortion is a practical zero-error coding scheme
based on MDS (maximum distance separable) codes.

3) providing, for both average-case and worst-case distortion criteria, a tight lower bound on 
the distortion when a single message is received at the decoder. For worst-case distortion, 
the outer bound holds for all n and k. Moreover, we show that the MDS coding scheme 
is Pareto optimal in the achievable distortions $D_1, \ldots, D_k$ for all n and k, and, for certain
ranges of n and k, is also optimal when more than one message is received at the decoder.
For average-case distortion, our outer bound holds, modulo a closure operation, for all
n and k satisfying $(1 - 1/n)^k < 1/2$. In addition, for $n \ge 3$ and $k = 2$, we provide an
outer bound on the optimal single-message distortion that differs by exactly 1/n from
the distortion achieved by the random binning scheme. Our results for the special case
in which there is no distortion for k messages (i.e., any k messages allow the decoder
to reconstruct the original source sequence completely) have appeared in [14] (average-case
distortion) and [15] (worst-case distortion).

4) proposing a coding scheme, based on the binary erasure MD coding schemes, for vector 
Gaussian MD and showing that it is optimal for scalar Gaussian MD with two levels 
of receivers and no excess rate for the central receiver. The scheme involves quantizing 
the vector Gaussian source according to a given quadratic distortion constraint and then 
transmitting the quantized version over the n channels according to the aforementioned 
binary erasure coding schemes. This shows that the binary erasure coding schemes can be 
used as part of a more general, layered coding scheme for multiple descriptions with a 
generic source distribution and arbitrary distortion metric. 

5) proving a new outer bound for the general multi-terminal source coding problem that 
improves upon the outer bound in [29], and

6) providing, for the robust binary erasure CEO problem with symmetric rates, a tight lower 
bound on the distortion for $\ell$ messages for any coding scheme that achieves the minimum
achievable distortion for $k < \ell$ messages. The robust binary erasure CEO problem is a
generalization of MD in that the encoders observe erased versions of the source instead of 
the source itself. This problem constitutes a reasonable model for decentralized peer-to- 
peer networks in which peers can generate new descriptions based on their partial copies 
of the source file. 

B. Relation to Prior Work 

An achievable rate region for the 2-description MD problem was first provided by El Gamal
and Cover [4]. This region was shown to be tight for a scalar Gaussian source and quadratic
distortion measure by Ozarow [5], and for a discrete memoryless source (DMS) with no excess
rate for two descriptions by Ahlswede [6]. Zhang and Berger [7] obtained a rate region for
the 2-description case that contained points strictly outside the El Gamal-Cover rate region.
Venkataramani, Kramer and Goyal provided a rate region for the n-description case [8], which
was improved upon by Pradhan, Puri, and Ramchandran [9], [10]. Tian and Chen proposed a
coding scheme for the n-description case, with symmetric rates and distortion constraints, that
combined a channel coding component with a source coding component to attain rate-distortion
points outside the region proposed in [9] in the Gaussian case [11]. Wang and Viswanath
derived the minimal achievable sum rate for vector Gaussian MD with individual and central
receivers [12]. More recently, Chen characterized the rate region of scalar Gaussian MD with
individual and central distortion constraints [13].

Multiple descriptions with no excess rate is a generalization of the problem of successive 
refinement [16], [17], [18], in which descriptions received in addition to the minimum number
required to reconstruct the source with a given distortion are used to improve the quality of 
reconstruction. The MD problem is also similar to the problem of lossy packet transmission 
considered by Albanese et al. [19]. They propose a coding method to deal with packet loss in
erasure networks that involves assigning a priority level to messages. The messages are encoded 
into packets, and the priority level determines the minimum number of packets required to 
reconstruct the message. Other work on similar problems includes symmetric multi-level diversity
(MLD) coding [20], in which K sources, each with a different level of importance, are encoded
by K encoders. The decoders have access to only a subset of the encoded descriptions, and
each decoder attempts to reconstruct the k most important sources, where k is the number
of descriptions that are accessible to it. More recently, Mohajer et al. [21] have considered a
variation on symmetric MLD coding in which $2^K - 1$ sources are encoded by K encoders, and
have characterized the rate region for $K = 3$.

Our binary erasure MD problem with no excess rate and no distortion for every k out of n 
messages is particularly significant in the context of peer-to-peer networks, since it can be used to 
study the tradeoff between the performance of two competing technologies: fountain codes [22],
[23] and BitTorrent [24]. For large n and small k, the MD problem mimics rateless fountain
codes, since out of a large number of descriptions, only a handful must be received (collected) 
in order to construct the source with zero distortion. Fountain codes are known to work well in 
erasure networks, but they usually have poor intermediate performance. Sanghavi [25] provides
an outer bound for rateless codes on the fraction of source symbols that can be decoded as a 
function of the number of encoded symbols received. For k = n, the MD problem resembles
BitTorrent, where all of the relevant packets must be received to allow for complete reconstruction
of the source. BitTorrent provides good intermediate performance but suffers from the
"coupon collector" problem: the initial pieces of the source can be acquired relatively rapidly,
but it takes much longer to collect the final pieces. By varying n and k in the binary erasure MD
model, therefore, the middle ground between fountain codes and BitTorrent can be explored.

The rest of this paper is organized as follows. In Section II we formulate the n-channel binary
erasure MD problem. Sections III and IV are devoted to our results for average-case distortion
and worst-case distortion, respectively. In Sections V and VI we describe our results for vector
Gaussian MD and the robust binary erasure CEO problem, respectively.

II. The n-Channel Binary Erasure Multiple Descriptions Problem

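Let $\{X_t\}_{t=1}^{\infty}$ be a memoryless uniform binary source, with the random variables $X_t$ taking
values in the alphabet $\mathcal{X} = \{+, -\}$. Let $\hat{\mathcal{X}}$ be the reconstruction space $\{+, -, 0\}$, where $0$
denotes the erasure symbol, with an associated distortion measure $d : \mathcal{X} \times \hat{\mathcal{X}} \to \{0, 1, \infty\}$ such
that

$$d(x, \hat{x}) = \begin{cases} 0 & \text{if } \hat{x} = x \\ 1 & \text{if } \hat{x} = 0 \\ \infty & \text{otherwise.} \end{cases}$$

The above per-letter measure is known as the erasure distortion measure [2, p. 370]. An encoder
is a function $f_i^{(l)} : \mathcal{X}^l \to \{1, \ldots, M_i^{(l)}\}$. A decoder is a function $g_{\mathcal{K}}^{(l)} : \prod_{k \in \mathcal{K}} \{1, \ldots, M_k^{(l)}\} \to \hat{\mathcal{X}}^l$,
where $\mathcal{K}$ is the set of descriptions received.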
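As a small illustration of the erasure distortion measure, the following Python sketch evaluates the per-letter distortion of a reconstruction against a source string. The function names and the string encoding of symbols ("+", "-", and "0" for an erasure) are assumptions made for this example only.

```python
import math

ERASE = "0"  # erasure symbol of the reconstruction alphabet {+, -, 0}

def erasure_distortion(x: str, x_hat: str) -> float:
    """Per-letter erasure distortion d(x, x_hat)."""
    if x_hat == x:
        return 0.0       # correct symbol
    if x_hat == ERASE:
        return 1.0       # declared erasure
    return math.inf      # an actual error is never tolerated

def per_letter_distortion(source: str, recon: str) -> float:
    """Average of d over the positions of one source/reconstruction pair."""
    if len(source) != len(recon):
        raise ValueError("source and reconstruction must have equal length")
    total = sum(erasure_distortion(x, y) for x, y in zip(source, recon))
    return total / len(source)
```

For instance, reconstructing "+-+-" as "+0+0" gives distortion 1/2, while any reconstruction containing a wrong symbol has infinite distortion.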

Let $\mathcal{N} = \{1, \ldots, n\}$. The n-channel multiple descriptions problem, illustrated in Figure 1,
can be formulated as follows. There are n encoders. Encoder $f_i^{(l)}$, $i \in \mathcal{N}$, encodes and transmits
a description of a length-$l$ source sequence over channel $i$. The receiver either receives this
description without errors or it does not receive it at all. Excluding the case where none of
the descriptions is received, the receiver may receive $2^n - 1$ different combinations of the n
descriptions. Thus it can be represented by the $2^n - 1$ decoding functions $g_{\mathcal{K}}^{(l)}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \neq \emptyset$.
Based on the set of descriptions received, the receiver employs the corresponding decoding
function to output a reconstruction of the original source string subject to a distortion constraint.
We consider symmetric descriptions, i.e., each description has the same rate and the distortion
constraint depends only on the number of descriptions received. 

We measure the fidelity of the reconstruction using two distortion criteria: an average-case 
distortion criterion, under which distortion is measured by taking the average of the per-letter 
distortion over all source sequences, and a worst-case distortion criterion, under which distortion 
is measured by taking the maximum of the per-letter distortion over all source sequences. We 
define achievability for the two criteria as follows. Let $\hat{X}_{\mathcal{K}}^l = g_{\mathcal{K}}^{(l)}(\{f_k^{(l)}(X^l) : k \in \mathcal{K}\})$ be the
reconstruction sequence corresponding to the source sequence $X^l$.

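Definition 1 (Average-case distortion). The rate-distortion vector $(R, D_1, \ldots, D_n)$ is achievable
if for some $l$ there exist encoders $f_i^{(l)}$, $i \in \mathcal{N}$, and decoders $g_{\mathcal{K}}^{(l)}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \neq \emptyset$, such that^1

$$R \ge \frac{1}{l} \log M_i^{(l)} \text{ for all } i, \text{ and}$$

$$D_k \ge \max_{\mathcal{K} : |\mathcal{K}| = k} E\left[\frac{1}{l} \sum_{t=1}^{l} d(X_t, \hat{X}_{\mathcal{K},t})\right].$$

We use $\mathcal{RD}_{avg}$ to denote the set of achievable rate-distortion vectors and $\overline{\mathcal{RD}}_{avg}$ to denote its
closure.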



Definition 2 (Worst-case distortion). The rate-distortion vector $(R, D_1, \ldots, D_n)$ is achievable
if for some $l$ there exist encoders $f_i^{(l)}$, $i \in \mathcal{N}$, and decoders $g_{\mathcal{K}}^{(l)}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \neq \emptyset$, such that

$$R \ge \frac{1}{l} \log M_i^{(l)} \text{ for all } i, \text{ and}$$

$$D_k \ge \max_{\mathcal{K} : |\mathcal{K}| = k} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{x}_{\mathcal{K},t}).$$

We use $\mathcal{RD}_{worst}$ to denote the set of achievable rate-distortion vectors. We describe our results
for average-case distortion in the next section and for worst-case distortion in Section IV. For
both distortion criteria, we consider the case where there is no excess rate for every k out of n
descriptions, i.e., $kR = R(D_k) = 1 - D_k$, where $R(\cdot)$ is the Shannon rate-distortion function.
Thus $R = (1 - D_k)/k$. We will henceforth use $R$ to denote $(1 - D_k)/k$. Our goal is to characterize
the achievable distortions $D_1, \ldots, D_n$ for both distortion criteria.



^1 All logarithms and exponentiations in this paper have base 2 unless explicitly stated.

Fig. 1. The n-channel multiple descriptions problem



It should be pointed out that the $k = n$ case is particularly simple. Let $D_i$, $i \in \mathcal{N}$, be the
distortion constraint when the receiver receives $i$ messages. No excess rate for n descriptions
dictates that the sum-rate of the n messages is exactly $(1 - D_n)$, which in turn implies that the
rate of each message is $(1 - D_n)/n$. The problem then reduces to characterizing the optimal
$D_1, \ldots, D_n$. Consider a coding scheme that takes a source string of length $l$ and erases the
last $lD_n$ bits. The remaining $l(1 - D_n)$ bits are divided into n disjoint parts, each consisting
of $l(1 - D_n)/n$ bits. Encoder $i$ transmits the $l(1 - D_n)/n$ bits in the $i$th part to the decoder
over the $i$th channel, with erasures in place of the remaining $l - l(1 - D_n)/n$ bits. Thus upon
reception of any k descriptions, the decoder can reconstruct $kl(1 - D_n)/n$ bits of the original
source string. Clearly, this scheme achieves $D_k = 1 - k(1 - D_n)/n$ under both the average-case
and worst-case distortion criteria. Moreover, for any code that achieves the rate-distortion
vector $((1 - D_n)/n, D_1, \ldots, D_n)$, every description has rate $(1 - D_n)/n$ and therefore any set of
k messages can reveal no more than a fraction $k(1 - D_n)/n$ of the bits of the original source string.
Thus

$$\max_{\mathcal{K} : |\mathcal{K}| = k} E\left[\frac{1}{l} \sum_{t=1}^{l} d(X_t, \hat{X}_{\mathcal{K},t})\right] \ge 1 - k(1 - D_n)/n,$$

and

$$\max_{\mathcal{K} : |\mathcal{K}| = k} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{x}_{\mathcal{K},t}) \ge 1 - k(1 - D_n)/n.$$

Thus the aforementioned coding scheme achieves the optimal $D_1, \ldots, D_n$ under both the average-case
and worst-case distortion criteria.
We use the insight obtained from the $k = n$ case to construct codes for the more complicated
case in which $k < n$. No excess rate for a particular set of k descriptions requires that information
transmitted over the corresponding channels be independent. Since we impose no excess rate
for every size-$k$ subset of descriptions, information transmitted over any k channels must be
mutually independent. The coding scheme for $k = n$ ensures that this condition is met by
dividing an erased version of the source string into n disjoint (and therefore independent) parts
and transmitting them uncoded over the n channels. This strategy of sending independent uncoded
bits works as long as the bits transmitted over each channel are disjoint. In particular, if
$R = (1 - D_k)/k \le 1/n$ (equivalently, $D_k \ge 1 - k/n$), the source string can always be divided into n
disjoint, equal parts, each containing a fraction R of the total number of bits. If $D_k < 1 - k/n$,
however, then $R > 1/n$ and it is not possible to divide the source string into n disjoint parts
each containing a fraction R of the total number of bits, since each part must then contain more
than 1/n of the total number of bits. Transmitting uncoded bits, therefore, will be optimal
only for rates up to 1/n; in order to achieve a rate larger than 1/n, additional information
about the source must be transmitted along with each description, and this information must be 
mutually independent for every set of k descriptions. 

The threshold $D_k = 1 - k/n$ therefore plays an important role in our coding schemes for both
average-case and worst-case distortions. If $D_k \ge 1 - k/n$, our coding scheme is based solely
on the transmission of independent uncoded bits over the n channels as described above. If
$D_k < 1 - k/n$, then in addition to sending uncoded bits, we employ random binning (for average-case
distortion) and MDS codes (for worst-case distortion) to communicate additional information
about the source sequence. The random binning component works by randomly binning an erased 
version of all possible source sequences at each encoder. Each encoder transmits uncoded bits 
from the observed source sequence along with the bin index of the corresponding erased version. 
The decoder uses the uncoded bits and the bin indices to output a partial reconstruction of 
the source sequence. Decoding the binned erased version in particular allows the decoder to 
reconstruct source bits other than the ones it receives uncoded. The average-case distortion 
scenario is conceptually simple, but provides weaker guarantees on optimality. The MDS coding 
scheme for worst-case distortion is based on a similar idea (transmission of uncoded bits plus 
encoded information about an erased version of the source string), but as we will see later, the 
worst-case distortion scenario provides much stronger guarantees on optimality than average-case
distortion. The coding schemes for average-case and worst-case are described in detail in
Sections III-A and IV-A, respectively.



III. The Average-case Distortion Criterion 
A. An Achievability Result 

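Definition 3. Given n, $k < n$, and $D_k \in [0, 1]$, define

$$\mathbf{R} = \left(R, 1 - R, 1 - 2R, \ldots, 1 - (k-1)R, D_k, D_k - R, D_k - 2R, \ldots, D_k - (n-k)R\right), \text{ and}$$

$$\tilde{\mathbf{R}} = \left(R, 1 - \frac{1}{n}, 1 - \frac{2}{n}, \ldots, 1 - \frac{k-1}{n}, D_k, \left(\frac{n-k-1}{n-k}\right) D_k, \left(\frac{n-k-2}{n-k}\right) D_k, \ldots, \left(\frac{1}{n-k}\right) D_k, 0\right).$$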
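To make the two vectors of Definition 3 concrete, the sketch below computes them with exact rational arithmetic. The function names `uncoded_vector` and `binning_vector` are hypothetical labels for the first and second vector, respectively; each returned list is ordered as (R, D_1, ..., D_n).

```python
from fractions import Fraction

def uncoded_vector(n, k, d_k):
    """First vector of Definition 3: D_m = 1 - m*R for every m.

    For m >= k this equals D_k - (m - k)*R, since D_k = 1 - k*R.
    """
    d_k = Fraction(d_k)
    r = (1 - d_k) / k
    return [r] + [1 - m * r for m in range(1, n + 1)]

def binning_vector(n, k, d_k):
    """Second vector of Definition 3 (requires k < n)."""
    d_k = Fraction(d_k)
    r = (1 - d_k) / k
    below_k = [1 - Fraction(m, n) for m in range(1, k)]                # D_1..D_{k-1}
    from_k = [Fraction(n - m, n - k) * d_k for m in range(k, n + 1)]  # D_k..D_n
    return [r] + below_k + from_k
```

For n = 4, k = 2 and D_k = 1/4 (below the threshold 1 - k/n = 1/2), `binning_vector` gives (3/8, 3/4, 1/4, 1/8, 0). At the threshold D_k = 1/2 the two vectors coincide.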

The following theorem shows that it is possible to achieve good intermediate performance 
when m < k descriptions are received at the decoder. 

Theorem 1. Let $D_k \in [0, 1]$. For any n and $k < n$, if $D_k \ge 1 - k/n$, then $\mathbf{R} \in \overline{\mathcal{RD}}_{avg}$. If
$D_k < 1 - k/n$, then $\tilde{\mathbf{R}} \in \overline{\mathcal{RD}}_{avg}$.

Proof: Case I: $D_k \ge 1 - k/n$.
Assume without loss of generality that $D_k$ is rational (if $D_k$ is irrational, then we can prove
achievability for a sequence of rational distortions in $[1 - k/n, 1]$ converging to $D_k$ and take
limits). Then there exists a positive integer $l'$ such that $l'R$ is a positive integer. Choose a
blocklength $l = anl'$, where $a$ is any positive integer. Observe a length-$l$ source sequence $X^l$,
and divide it into n disjoint parts such that each part contains $l/n = al'$ bits. (The division is
the same regardless of the source realization.) Label the parts $X_i$, $i \in \mathcal{N}$. Choose $lR$ bits from
each of the n parts (since $D_k \ge 1 - k/n$, $lR \le l/n$ and therefore $lR$ bits can be chosen from each
part). Denote by $y_i$ the set of $lR$ bits chosen from $X_i$. Transmit $y_i$ uncoded over the $i$th channel.

The decoding is trivial. If m descriptions, say $(y_1, \ldots, y_m)$, are received, output $\hat{X}_m^l$ as the
reconstruction of $X^l$, where $\hat{X}_m^l$ is such that the $mlR$ bits corresponding to $(y_1, \ldots, y_m)$ are
non-erased and the other $(l - mlR)$ bits are erasures. The distortion, therefore, is $(l - mlR)/l = 1 - mR$.
When k descriptions are received, the distortion is $1 - kR = D_k$. Thus the rate-distortion vector
$(R, 1 - R, 1 - 2R, \ldots, 1 - (k-1)R, D_k, D_k - R, D_k - 2R, \ldots, D_k - (n-k)R) \in \mathcal{RD}_{avg}$, and
therefore also lies in $\overline{\mathcal{RD}}_{avg}$.
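The Case I scheme can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the construction used in the formal proof: the names `encode_uncoded`, `decode_uncoded`, and `erasure_fraction` are hypothetical, symbols are the characters "+" and "-", "0" marks an erasure, and each description is represented as a map from source position to symbol.

```python
from fractions import Fraction

ERASE = "0"

def encode_uncoded(source, n, k, d_k):
    """Split the source into n equal parts and send the first l*R symbols of each."""
    l = len(source)
    bits = (1 - Fraction(d_k)) / k * l  # l*R symbols per description
    if l % n != 0 or bits.denominator != 1 or bits > l // n:
        raise ValueError("need n | l, integer l*R, and R <= 1/n")
    part, bits = l // n, int(bits)
    return [{t: source[t] for t in range(i * part, i * part + bits)}
            for i in range(n)]

def decode_uncoded(received, l):
    """Fill in every revealed position; everything else stays an erasure."""
    recon = [ERASE] * l
    for description in received:
        for t, symbol in description.items():
            recon[t] = symbol
    return "".join(recon)

def erasure_fraction(recon):
    """Per-letter distortion of an error-free reconstruction."""
    return Fraction(recon.count(ERASE), len(recon))
```

With l = 12, n = 3, k = 2 and D_k = 1/2, each description carries lR = 3 symbols, and receiving any m descriptions leaves a fraction 1 - m/4 of the symbols erased, matching 1 - mR.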




Case II: $D_k < 1 - k/n$.

The scheme for this case is an extension of the scheme for Case I. It has two components: random
binning and transmission of uncoded source bits. An erased version of every source sequence
is binned separately at each encoder. The observed source string is divided into n disjoint parts.
Each uncoded part is then sent on one of the n channels along with the corresponding bin index
of the erased version of the source. If fewer than k descriptions are received, the decoder outputs a
partial reconstruction based solely on the uncoded parts; if k or more descriptions are received,
the decoder outputs a reconstruction based on the uncoded parts and the bin indices.

Assume again that $D_k$ is rational. Choose $\epsilon > 0$, and define $R' = (1 - D_k)/k - 1/n + \epsilon$. Since
$D_k$ is rational, there exists a positive integer $l'$ such that $l'D_k/(n - k)$ is an integer. Choose a
blocklength $l = anl'$, where $a$ is any positive integer.

Random binning: Construct n sets of bins such that every set contains $2^{lR'}$ bins. For every
length-$l$ source string $x^l \in \mathcal{X}^l$, construct an erased version as follows. Divide $x^l$ into n disjoint
parts such that each part contains $l/n = al'$ bits (the division is done identically for all source
sequences). For each part, replace the last $lD_k/(n - k)$ bits by erasures (since $D_k < 1 - k/n$,
each part contains $l/n > lD_k/(n - k)$ bits). Assign the resulting erased version $x_e^l$ uniformly
at random, and independently from other strings, to one of the $2^{lR'}$ bins in the $i$th set, for all
$i \in \mathcal{N}$. The assignment is done only once for each erased version. This is important because
multiple source strings can have the same erased version. Denote the assignments by $\Gamma_i$.

Encoding: Let $X^l$ be the observed source sequence. Divide $X^l$ into n disjoint parts each
containing $l/n$ bits as described above. Label the parts $X_i$, $i \in \mathcal{N}$. Let $b_i = \Gamma_i(X_e^l)$ be the index
of the bin containing the erased version of $X^l$ in the $i$th bin set. Transmit $(X_i, b_i)$ over the $i$th
channel.

Decoding: If m descriptions, say $\{(X_1, b_1), \ldots, (X_m, b_m)\}$, are received, where $m < k$,
output $\hat{X}_m^l$ as the reconstruction of $X^l$, where $\hat{X}_m^l$ is such that the $ml/n$ bits corresponding to
$(X_1, \ldots, X_m)$ are non-erased and the other $(l - ml/n)$ bits are erasures. If $m \ge k$ descriptions are
received, say $\{(X_1, b_1), \ldots, (X_m, b_m)\}$, choose any k descriptions, say $\{(X_1, b_1), \ldots, (X_k, b_k)\}$,
and search the bins $(b_1, \ldots, b_k)$ for a sequence $y$ such that $\Gamma_i(y) = b_i$, $i = 1, \ldots, k$, and $y$ is
consistent with the partially revealed source string $(X_1, \ldots, X_k)$. Output $\hat{X}_m^l = \{(X_1, \ldots, X_m)\} \cup
\{y\}$ as the reconstruction of $X^l$. (Thus the non-erased bits in $\hat{X}_m^l$ are the bits revealed by
$(X_1, \ldots, X_m)$ or by the erased version $y$, or both.) There is guaranteed to be at least one such
sequence $y$ in the bins indexed by $b_1, \ldots, b_k$. If there is more than one such sequence, output the
all-erasure string as the reconstruction of $X^l$. (This will suffice to meet our distortion constraint.)
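The binning, encoding, and decoding steps can be mimicked on a toy instance. The sketch below is only an illustration under stated assumptions: the names `erased_version`, `make_bins`, `encode`, and `decode` are hypothetical, bins are assigned by exhaustive enumeration (feasible only for very short blocklengths), and the decoder shown handles the branch with k or more received descriptions, using every received bin index rather than a chosen subset of k.

```python
import itertools
import random

ERASE = "0"

def erased_version(x, n, erase_per_part):
    """Replace the last erase_per_part symbols of each of the n parts by erasures."""
    part = len(x) // n
    return "".join(x[i * part:(i + 1) * part - erase_per_part] + ERASE * erase_per_part
                   for i in range(n))

def make_bins(l, n, erase_per_part, num_bins, rng):
    """One random bin assignment per encoder, made once per distinct erased version."""
    versions = sorted({erased_version("".join(x), n, erase_per_part)
                       for x in itertools.product("+-", repeat=l)})
    return [{v: rng.randrange(num_bins) for v in versions} for _ in range(n)]

def encode(x, n, erase_per_part, bins):
    """Description i is the uncoded part X_i together with the bin index b_i."""
    part = len(x) // n
    xe = erased_version(x, n, erase_per_part)
    return [(x[i * part:(i + 1) * part], bins[i][xe]) for i in range(n)]

def decode(received, l, n, bins):
    """received maps encoder index i to (X_i, b_i); assumes at least k entries."""
    part = l // n

    def consistent(v):
        return all(v[i * part + j] in (ERASE, xi[j])
                   for i, (xi, _) in received.items() for j in range(part))

    matches = [v for v in bins[0]
               if consistent(v) and all(bins[i][v] == b for i, (_, b) in received.items())]
    if len(matches) != 1:
        return ERASE * l  # ambiguous: declare everything erased
    recon = list(matches[0])
    for i, (xi, _) in received.items():
        recon[i * part:(i + 1) * part] = xi  # overlay the uncoded parts
    return "".join(recon)
```

For example, with n = 3, k = 2, l = 6 and D_k = 1/6, each part has two symbols and one of them is erased in the erased version. Whenever the bin search is unambiguous, two received descriptions leave exactly one of the six symbols erased, i.e., distortion D_k; the decoder never outputs a wrong symbol.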

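Error analysis: We say an error $E_S$ has occurred at the decoder if, for a set $S = \{s_1, \ldots, s_k\}$
of k descriptions, there exists an erased version $y \neq X_e^l$ such that $\Gamma_{s_i}(y) = \Gamma_{s_i}(X_e^l)$ for all
$s_i \in S$ and $y$ is consistent with $(X_{s_1}, \ldots, X_{s_k})$. Let $C_S$ be the set of erased versions that are
consistent with $(X_{s_1}, \ldots, X_{s_k})$. Define $E = \bigcup_{S : |S| = k} E_S$. We bound $\Pr(E)$ as follows.

$$\Pr(E) \le \sum_{S : |S| = k} \Pr(E_S)$$
$$= \sum_{S : |S| = k} \sum_{x^l} p(x^l) \Pr(E_S \mid X^l = x^l)$$
$$\le \sum_{S : |S| = k} \sum_{x^l} p(x^l) \sum_{y \in C_S,\, y \neq x_e^l} \Pr\left(\Gamma_{s_i}(y) = \Gamma_{s_i}(x_e^l) \text{ for all } s_i \in S\right)$$
$$= \sum_{S : |S| = k} \sum_{x^l} p(x^l) \sum_{y \in C_S,\, y \neq x_e^l} 2^{-klR'}$$
$$\le \sum_{x^l} p(x^l) \sum_{S : |S| = k} 2^{l(1 - k/n - D_k)} \, 2^{-klR'}$$
$$= \binom{n}{k} 2^{-kl\epsilon}.$$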

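We now show that for any $\epsilon > 0$, the $(n + 1)$-tuple $\left(R + \epsilon, 1 - \frac{1}{n} + \epsilon, 1 - \frac{2}{n} + \epsilon, \ldots, 1 - \frac{k-1}{n} + \epsilon, D_k + \epsilon, \left(\frac{n-k-1}{n-k}\right) D_k + \epsilon, \left(\frac{n-k-2}{n-k}\right) D_k + \epsilon, \ldots, \left(\frac{1}{n-k}\right) D_k + \epsilon, \epsilon\right)$
is achievable, and thus $\left(R, 1 - \frac{1}{n}, 1 - \frac{2}{n}, \ldots, 1 - \frac{k-1}{n}, D_k, \left(\frac{n-k-1}{n-k}\right) D_k, \left(\frac{n-k-2}{n-k}\right) D_k, \ldots, \left(\frac{1}{n-k}\right) D_k, 0\right) \in \overline{\mathcal{RD}}_{avg}$.
Fix $\epsilon > 0$ and define $R'$ as above. In our scheme, any description $(X_i, b_i)$ has rate $1/n + R'$,
where $1/n$ is the rate due to $X_i$ and $R'$ is the rate due to binning. Thus the rate of each description
is $1/n + ((1 - D_k)/k - 1/n + \epsilon) = (1 - D_k)/k + \epsilon = R + \epsilon$.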
Moreover, if $m < k$ descriptions are received, the decoder outputs $ml/n$ bits as revealed by the
m descriptions and the other $(l - ml/n)$ bits as erasures. Thus $D_m = 1 - m/n < 1 - m/n + \epsilon$. If
k descriptions are received, say $S = \{s_1, \ldots, s_k\}$, the decoder either outputs an erased version
of the correct source sequence if $E_S^c$ occurs, or outputs an all-erasure string if $E_S$ occurs. If $E_S^c$
occurs, then the decoder receives $kl/n$ bits uncoded from the k descriptions, and is able to figure
out a further $(n - k)(l/n - lD_k/(n - k)) = l(1 - k/n - D_k)$ bits by using the bin indices to
decode the erased version of the source sequence. Hence the maximum per-letter distortion over
sets of k descriptions is $1 - (k/n + 1 - k/n - D_k) = D_k$ if $E^c$ occurs, and 1 if $E$ occurs. Let
$d_{S,x}$ be the per-letter distortion achieved using the set S of descriptions if the observed source
string is $x^l$. Thus

$$E_{f,g} \max_{S : |S| = k} E_X[d_{S,X}] \le E_{f,g} E_X\left[\max_{S : |S| = k} d_{S,X}\right] \le \Pr(E) + D_k (1 - \Pr(E)) = (1 - D_k) \Pr(E) + D_k,$$

which can be made smaller than $D_k + \epsilon$ by letting $a \to \infty$. Thus $D_k + \epsilon$ is achievable for some
sufficiently large $l$. If $m > k$ descriptions are received, then the decoder receives $ml/n$ bits
uncoded, and is able to figure out a further $(n - m)(l/n - lD_k/(n - k))$ bits by decoding the
binned erased version. Thus, if $E^c$ occurs, the maximum per-letter distortion is
$1 - m/n - ((n - m)/n - (n - m)D_k/(n - k)) = \left(\frac{n-m}{n-k}\right) D_k$, and by the same analysis as above, a distortion of
$\left(\frac{n-m}{n-k}\right) D_k + \epsilon$ can be achieved for some sufficiently large $l$. ■

B. Optimality Results

In this section we present optimality results for the random binning coding scheme described
in the previous subsection. We first establish some preliminary results in Appendix A, which
will be used in the proofs of the following theorems. Our optimality results for the average-case
deal primarily with single-message optimality, i.e., when only one message is received at
the decoder. In the next section, we shall see that stronger optimality results can be established
for the worst-case distortion criterion.

The following theorem shows that when only one message is received at the decoder, the
scheme is optimal, modulo a closure operation, for all n and k satisfying $(1 - 1/n)^k < 1/2$. Recall
that, given $D_k$, we use $R$ to denote $(1 - D_k)/k$.

Definition 4. For any fixed $D_k$, define

$$D_1^* = \inf\{D_1 : (R, D_1, \ldots, D_k, \ldots, D_n) \in \mathcal{RD}_{avg}\}.$$

Theorem 2. For any n and $k < n$, if $D_k \ge 1 - k/n$, then for any $(R, D_1, \ldots, D_k, \ldots, D_n) \in \mathcal{RD}_{avg}$,
$D_m \ge 1 - mR$ for all $m \in \mathcal{N}$. If $D_k < 1 - k/n$, $D_k$ is rational^2, and $(1 - 1/n)^k < 1/2$, then $D_1^* \ge 1 - 1/n$.

Proof: See Appendix B. ■

We note that $(1 - 1/n)^k < 1/2$ implies $k > 1/\log(n/(n-1)) := \lambda(n)$. Since $\lambda(n)/n \to 1/\log e$ as
$n \to \infty$, the second part of Theorem 2 provides a lower bound on $D_1^*$ for a large range of k
when n is large.
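The condition on n and k is easy to evaluate numerically. The following sketch, with hypothetical helper names `lam` and `condition_holds`, checks the condition exactly with rational arithmetic and computes the threshold $\lambda(n)$ with base-2 logarithms, as used throughout the paper.

```python
import math
from fractions import Fraction

def lam(n):
    """lambda(n) = 1 / log2(n / (n - 1)), defined for n >= 2."""
    return 1 / math.log2(n / (n - 1))

def condition_holds(n, k):
    """Exact check of (1 - 1/n)^k < 1/2."""
    return Fraction(n - 1, n) ** k < Fraction(1, 2)

def smallest_k(n):
    """Smallest k for which the condition holds."""
    k = 1
    while not condition_holds(n, k):
        k += 1
    return k
```

For n = 4 the condition fails at k = 2, since (3/4)^2 = 9/16, and first holds at k = 3, consistent with $\lambda(4) \approx 2.41$. For large n, `lam(n) / n` approaches 1/log e, which is about 0.693 for base-2 logarithms.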

The following theorem proves single-message optimality for the coding scheme when n = 4 
and k = 2. This case is not included in Theorem 2.

Theorem 3. Let $D_k < 1 - k/n$ be rational. If n = 4 and k = 2, then $D_1^* \ge 1 - 1/n$.

Proof: See Appendix C. ■

Theorem 2 handles the regime in which k is large. We now study the other extreme, i.e.,
when k is small. In particular, we look at the k = 2 case. The following theorem provides a
lower bound on the optimal single-message distortion for $n \ge 3$ and k = 2. This lower bound
differs from the distortion achieved by our coding scheme by exactly 1/n, and thus becomes
progressively tighter as n increases.

Theorem 4. Let $D_k < 1 - k/n$ be rational. If k = 2, then for $n \ge 3$, $D_1^* \ge 1 - 2/n$.

Proof: See Appendix D. ■

We conjecture that the lower bound in Theorem 4 is not tight and that our scheme is in fact
optimal. Evidence of this is provided by Theorem 3.

IV. The Worst-case Distortion Criterion 

We turn now to the worst-case distortion criterion. We begin by presenting a practical, zero-error
coding scheme based on systematic MDS codes that works for finite blocklengths. Like the
random binning coding scheme for average-case distortion, the MDS coding scheme consists of
two parts: uncoded bits and an MDS-code component. The uncoded component is similar to
the uncoded component of the average-case coding scheme. The difference lies in the encoded
component: instead of randomly binning an erased version of the source and then sending bin
indices to the decoder (as the average-case distortion encoder does), the worst-case distortion
encoder encodes the erased version using an (n, k) systematic MDS code. If fewer than k
descriptions are received, the decoder outputs the uncoded bits and the bits revealed by the
systematic part of the MDS code as the source reconstruction. If k or more descriptions are received,
the decoder uses the uncoded bits and the bits revealed by the systematic part of the MDS code
to decode the encoded erased version by applying an MDS decoding algorithm. The following
subsection discusses the achievable distortion region of the MDS coding scheme.

Footnote (to Theorem 2): For Theorem 2 and the subsequent theorems in its subsection, we consider
rational values for $D_k$, since any code over a finite blocklength can yield only rational distortions.

A. An Achievability Result 

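Theorem 5. Let $D_k$ be a rational number in the interval [0, 1]. For any $n$ and $k < n$, if
$D_k \ge 1 - k/n$, then
$(R, 1-R, 1-2R, \ldots, 1-(k-1)R, D_k, D_k-R, D_k-2R, \ldots, D_k-(n-k)R) \in \mathcal{RD}_{\mathrm{worst}}$.
If $D_k < 1 - k/n$, then
$(R, 1-\frac{1}{n}, \ldots, 1-\frac{k-1}{n}, D_k, \frac{n-k-1}{n-k}D_k, \frac{n-k-2}{n-k}D_k, \ldots, \frac{1}{n-k}D_k, 0) \in \mathcal{RD}_{\mathrm{worst}}$.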

Proof: Case I: $D_k \ge 1 - k/n$, $D_k$ rational.
Since $D_k$ is rational, there exists a positive integer $l'$ such that $l'R$ is a positive integer. Choose
a blocklength $l = a n l'$, where $a$ is any positive integer. Observe a length-$l$ source sequence $X^l$,
and divide it into $n$ disjoint parts such that each part contains $l/n = a l'$ bits. (The division is
the same regardless of the source realization.) Label the parts $X_i$, $i \in \mathcal{N}$. Choose $lR$ bits from
each of the $n$ parts (since $D_k \ge 1 - k/n$, we have $lR \le l/n$, and therefore $lR$ bits can be chosen
from each part). Denote by $y_i$ the set of $lR$ bits chosen from $X_i$. Transmit $y_i$ uncoded over the
$i$th channel.

The decoding is trivial. If $m$ descriptions, say $(y_1, \ldots, y_m)$, are received, output $\hat{X}^l$ as the
reconstruction of $X^l$, where $\hat{X}^l$ is such that the $mlR$ bits corresponding to $(y_1, \ldots, y_m)$ are
non-erased and the other $(l - mlR)$ bits are erasures. Since the reconstruction sequence has $l - mlR$
erasures regardless of the source sequence, the worst-case distortion $D_m$ is $(l - mlR)/l = 1 - mR$.
When $k$ descriptions are received, the worst-case distortion is $1 - kR = D_k$. Thus the rate-distortion
vector $(R, 1-R, 1-2R, \ldots, 1-(k-1)R, D_k, D_k-R, D_k-2R, \ldots, D_k-(n-k)R) \in \mathcal{RD}_{\mathrm{worst}}$.
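The uncoded scheme of Case I can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the $lR$ bits taken from each part are simply the first $lR$ bits of that part (the proof allows any fixed choice), and the helper names `uncoded_descriptions`, `reconstruct` and `erasure_distortion` are hypothetical.

```python
from fractions import Fraction


def uncoded_descriptions(source, n, R):
    """Split `source` into n equal parts; description i carries the first l*R bits of part i.

    Returns a list of {position: bit} dictionaries, one per description.
    """
    l = len(source)
    part_len, per_desc = Fraction(l, n), l * Fraction(R)
    # The blocklength must make both quantities integers, and l*R <= l/n must hold.
    assert part_len.denominator == 1 and per_desc.denominator == 1
    assert per_desc <= part_len
    part_len, per_desc = int(part_len), int(per_desc)
    return [
        {t: source[t] for t in range(i * part_len, i * part_len + per_desc)}
        for i in range(n)
    ]


def reconstruct(l, received):
    """Fill in every revealed bit; all other positions stay erased (None)."""
    out = [None] * l
    for desc in received:
        for t, bit in desc.items():
            out[t] = bit
    return out


def erasure_distortion(recon):
    """Fraction of erased positions in the reconstruction."""
    return Fraction(sum(b is None for b in recon), len(recon))
```

Because the set of revealed positions does not depend on the source realization, the distortion computed here is the same for every source sequence, which is why the worst-case and per-sequence distortions coincide for this scheme.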

Case II: $D_k < 1 - k/n$, $D_k$ rational.

For this case, we present an achievability scheme based on MDS (maximum distance separable)
codes. Like the achievability scheme for the average case, this scheme has two components:
uncoded bits and an MDS-code component. Let $m$ be the smallest integer such that $2^m > n$
and $mnkD_k/(n(1-D_k)-k)$ is an integer (such an $m$ exists because $D_k$ is rational). Define
$q = 2^m$, and construct a $q$-ary MDS code of length $q - 1$ and dimension $k$. By repeatedly
puncturing this $(q-1, k)$ MDS code, we obtain a punctured MDS code of size $(n, k)$ [27, p. 190].
The punctured coordinates are revealed to the decoder. Let $G_1$ be the generator matrix of the
punctured $(n, k)$ MDS code, and assume without loss of generality that $G_1$ is systematic, i.e.,
$G_1$ is of the form $[I_k | A]$, where $I_k$ is the $k \times k$ identity matrix and $A$ is a
$k \times (n-k)$ matrix over the finite field GF($q$). Construct matrices $G_2, \ldots, G_n$ by shifting
the columns of $G_1$ to the right, i.e., $G_i$ is the matrix formed by shifting the columns of $G_1$
by $i - 1$ places, with the last $i - 1$ columns of $G_1$ wrapping around. In particular, if
$G_1 = [I_k | A_1 \ldots A_{n-k}]$, where $A_1, \ldots, A_{n-k}$ are the columns of $A$, then for
$i - 1 \le n - k$,
$G_i = [A_{n-k-i+2} \ldots A_{n-k} | I_k | A_1 \ldots A_{n-k-i+1}]$.
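The cyclic column shift that produces the shifted generator matrices can be illustrated as follows. This is a structural sketch only: matrix entries are treated as opaque labels for field elements (no finite-field arithmetic is performed), and `shifted_generators` and `is_unit_column` are hypothetical helper names. The check confirms that every column index falls in the systematic (identity) part of exactly $k$ of the $n$ shifted matrices, provided no column of $A$ is itself a unit vector.

```python
def shifted_generators(G1):
    """Return [G_1, ..., G_n], where G_i is G_1 with columns cyclically shifted right by i-1."""
    n = len(G1[0])
    # After a right shift by s, column c of the result is column (c - s) mod n of G1.
    return [[[row[(c - s) % n] for c in range(n)] for row in G1] for s in range(n)]


def is_unit_column(G, c):
    """True if column c of G has a single 1 and zeros elsewhere."""
    col = [row[c] for row in G]
    return sorted(col) == [0] * (len(col) - 1) + [1]


if __name__ == "__main__":
    # Toy example with k = 2, n = 4; entries 1 and 2 stand for nonzero field elements.
    G1 = [[1, 0, 1, 1],
          [0, 1, 1, 2]]
    Gs = shifted_generators(G1)
    for c in range(4):
        print(c, [j + 1 for j, G in enumerate(Gs) if is_unit_column(G, c)])
```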

Encoding: Let $X^l$ be the observed source string, of length $l = \frac{mnk(n-k)}{n(1-D_k)-k}$ bits.
Divide $X^l$ into $n$ disjoint parts, each of length $\frac{mk(n-k)}{n(1-D_k)-k}$ bits. (The division is
done the same way regardless of the source realization.) Let $X_i$, $i \in \mathcal{N}$, denote the last
$lD_k/(n-k)$ bits of the $i$th part. Construct an erased version $X_e^l$ by replacing the last
$lD_k/(n-k)$ bits in each of the $n$ parts by erasures. Thus $X_e^l$ has $l(1 - \frac{nD_k}{n-k}) = mnk$
unerased bits. Each of the $n$ parts of $X_e^l$ has $mk$ unerased bits and can therefore be treated as
a concatenation of $k$ binary strings of length $m$, such that each of these binary strings is the
binary representation of an element in GF($q$). Thus each of the $n$ parts of $X_e^l$ can be
mapped to a vector of length $k$ over GF($q$). Label these vectors $p_j$, $j \in \mathcal{N}$. Let
$y_j = p_j G_j$, $j \in \mathcal{N}$. Thus the $y_j$ are length-$n$ vectors over GF($q$). Let
$y_{ji} = p_j G_{ji}$ denote the $i$th element of $y_j$ (here $G_{ji}$ is the $i$th column of $G_j$).
Transmit $(X_i, y_{ji} : j \in \mathcal{N})$ over the $i$th channel.

Decoding: Suppose $c < k$ descriptions are received at the decoder. Let $\mathcal{M} \subseteq \mathcal{N}$
denote the set of indices of the received descriptions. Assume without loss of generality that
$i \in \mathcal{M}$. Thus the decoder receives $X_i$ and $(y_{ji} = p_j G_{ji} : j \in \mathcal{N})$. Thus
$lD_k/(n-k)$ bits are revealed to the decoder via $X_i$. Now for a fixed $i$, exactly $k$ of the
$G_j$, $j \in \mathcal{N}$ (in particular, $G_{i-k+1}, \ldots, G_i$), will have their $i$th column in the
systematic part. Thus one symbol from $k$ of the $p_j$, $j \in \mathcal{N}$, can be decoded. By mapping
these decoded symbols to their binary representations, the decoder can obtain a partial
reconstruction of $X^l$. Let $\hat{X}_i$ represent the reconstructed source bits due to the $i$th
description. Output $(\hat{X}_i : i \in \mathcal{M})$ as the reconstruction of $X^l$.

If $c \ge k$ descriptions are received, then any $k$ descriptions reveal $k$ symbols from each of the
$y_j$, $j \in \mathcal{N}$. Also, since the punctured coordinates are known to the decoder, it can
construct a longer codeword from every partially received codeword by adding erasures in place of
the punctured coordinates. The longer codewords can be treated as codewords from the original
$(q-1, k)$ MDS code. The original MDS code can subsequently be decoded by applying an erasure
decoding algorithm [27, Ch. 9], and all the $p_j$ vectors can be recovered. Mapping the $p_j$ vectors
to their binary representations reveals the erased version $X_e^l$ of the original source string.
Output $\{(X_1, \ldots, X_c)\} \cup \{X_e^l \setminus (X_1, \ldots, X_c)\}$ as the reconstruction of $X^l$.

Footnote: An (n, k) MDS code is a linear code that meets the Singleton bound with equality, i.e.,
the minimum Hamming distance between any two distinct codewords is n - k + 1. Reed-Solomon codes,
for instance, are MDS codes.
Analysis: We now argue that the above scheme achieves the rate-distortion vector
$(R, 1-\frac{1}{n}, \ldots, 1-\frac{k-1}{n}, D_k, \frac{n-k-1}{n-k}D_k, \frac{n-k-2}{n-k}D_k, \ldots, \frac{1}{n-k}D_k, 0)$.
For any source string $X^l$, every description (say the $i$th description) consists of
$(X_i, y_{ji} : j \in \mathcal{N})$. $X_i$ consists of $lD_k/(n-k)$ bits. Now since $y_{ji}$ is an element
of GF($q$), it can be represented by $m$ bits. Thus $(y_{ji} : j \in \mathcal{N})$ is a length-$n$ vector
over GF($q$), and can be represented by $mn$ bits. Every description therefore consists of
$mn + lD_k/(n-k)$ bits. Since the source string consists of $l = mnk(n-k)/(n(1-D_k)-k)$ source
symbols, every description has rate

$$\frac{mn + lD_k/(n-k)}{l} = \frac{1 - D_k}{k} = R.$$

Moreover, every description received at the decoder reveals $lD_k/(n-k)$ bits via $X_i$, and exactly
one symbol from $k$ of the $p_j$, $j \in \mathcal{N}$. Each of these $k$ symbols is an element of GF($q$)
and can be represented by $m$ bits. Thus every description reveals $lD_k/(n-k) + mk$ bits to the
decoder. (We note that the bits revealed by any two descriptions are disjoint. The uncoded bits
$X_a$ and $X_b$ are disjoint by definition for any two descriptions $a$ and $b$. Now suppose
descriptions $a$ and $b$ revealed the same symbol from some $p_j$. Then
$y_{ja} = p_j G_{ja} = p_j G_{jb} = y_{jb}$, which implies $a = b$.) Thus if $c < k$ descriptions are
received, the decoder can reconstruct $c(lD_k/(n-k) + mk)$ bits of the original source sequence. Thus

$$D_c = 1 - \frac{c\left(\frac{lD_k}{n-k} + mk\right)}{l} = 1 - \frac{cD_k}{n-k} - \frac{cn(1-D_k) - ck}{n(n-k)} = 1 - \frac{c}{n}.$$

If $c \ge k$ descriptions are received, say descriptions $1, \ldots, c$, then $(X_1, \ldots, X_c)$ reveal
$clD_k/(n-k)$ bits. Moreover, the erased version of the source sequence, $X_e^l$, can be reconstructed
by applying the MDS erasure decoding algorithm. The bits revealed by $(X_1, \ldots, X_c)$ are disjoint
from the bits revealed by $X_e^l$. The total number of bits revealed, therefore, is
$clD_k/(n-k) + mnk$. Thus

$$D_c = 1 - \frac{\frac{clD_k}{n-k} + mnk}{l} = 1 - \frac{cD_k}{n-k} - \frac{n(1-D_k) - k}{n-k} = \frac{n-c}{n-k} D_k.$$

Thus $(R, 1-\frac{1}{n}, 1-\frac{2}{n}, \ldots, 1-\frac{k-1}{n}, D_k, \frac{n-k-1}{n-k}D_k, \frac{n-k-2}{n-k}D_k, \ldots, \frac{1}{n-k}D_k, 0) \in \mathcal{RD}_{\mathrm{worst}}$. ■
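The bit-counting in the analysis can be checked with exact rational arithmetic. The sketch below assumes a caller-supplied field exponent `m` for which the blocklength and the number of uncoded bits per description are integers; `mds_scheme_parameters` is a hypothetical helper that only does the accounting and performs no actual MDS encoding.

```python
from fractions import Fraction


def mds_scheme_parameters(n, k, Dk, m):
    """Return (blocklength l, per-description rate, [D_1, ..., D_n]) from the bit counts."""
    Dk = Fraction(Dk)
    assert Dk < 1 - Fraction(k, n), "Case II requires D_k < 1 - k/n"
    l = Fraction(m * n * k * (n - k)) / (n * (1 - Dk) - k)
    uncoded = l * Dk / (n - k)  # uncoded bits carried by each description
    assert l.denominator == 1 and uncoded.denominator == 1, "choose m so these are integers"
    rate = (m * n + uncoded) / l  # n field symbols of m bits each, plus the uncoded bits

    def distortion(c):
        if c < k:
            revealed = c * (uncoded + m * k)   # k systematic symbols per description
        else:
            revealed = c * uncoded + m * n * k  # all coded bits recovered by MDS decoding
        return 1 - revealed / l

    return l, rate, [distortion(c) for c in range(1, n + 1)]


if __name__ == "__main__":
    print(mds_scheme_parameters(4, 2, Fraction(1, 4), 3))
```

For n = 4, k = 2, $D_k = 1/4$ and m = 3, the counts give a rate of $(1 - D_k)/k = 3/8$ and a single-description distortion of $1 - 1/n = 3/4$, in agreement with the closed-form expressions.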



Figure 2 depicts how the achievable distortion varies with the number of descriptions received
at the decoder when $D_k = 0$.

[Figure 2: plot of achievable distortion (vertical axis, tick at 1) versus descriptions received
(horizontal axis: 1, 2, 3, ..., k-1, k, k+1).]

Fig. 2. The achievable distortion region for $D_k = 0$. The achievable distortion decreases linearly with the number of descriptions
received up to k - 1 descriptions, and drops abruptly to zero upon reception of k or more descriptions.



B. Optimality Results 

We now present optimality results for the MDS coding scheme described in the previous 
subsection. These optimality results are stronger than those for average-case distortion and yield 
a more complete characterization of the achievable distortion region. Since we are dealing with 
worst-case distortion constraints, the following results hold for any source distribution.

Theorem 6. For any $n$ and $k$, if $D_k \ge 1 - k/n$ and rational, then for any
$(R, D_1, \ldots, D_k, \ldots, D_n) \in \mathcal{RD}_{\mathrm{worst}}$, $D_m \ge 1 - mR$ for all $m \in \mathcal{N}$.

Proof: Let $D_k \ge 1 - k/n$. If a code achieves a certain distortion under worst-case distortion,
then it will achieve that distortion under average-case distortion as well. The result therefore
follows from the first part of Theorem 2. ■
The following lemma is integral to the proofs of our optimality results for worst-case distortion. 

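Definition 5. Let $X^l$ be a random vector taking values in $\mathcal{X}^l$. An erased version of $X^l$
is a random vector $\hat{X}^l$, taking values in $\hat{\mathcal{X}}^l$, such that there is no
$t \in \{1, \ldots, l\}$ for which $X_t = +$ and $\hat{X}_t = -$, or $X_t = -$ and $\hat{X}_t = +$.

Lemma 1. Let $\hat{X}_1^l(X^l), \hat{X}_2^l(X^l), \ldots, \hat{X}_n^l(X^l)$ be erased versions of the source
string $X^l \in \mathcal{X}^l$. Suppose $X^l$ is i.i.d. uniform over $\mathcal{X}^l$. If for all
$t \in \{1, \ldots, l\}$, $I(\hat{X}_{it}(X^l); \hat{X}_{jt}(X^l)) = 0$ for all $i, j \in \mathcal{N}$,
$i \ne j$, then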

-'^d{xt,Xit{x)) 



max } 



> n-\. 



t=i 

Proof: See Appendix E. ■
The following theorem proves that the MDS coding scheme is optimal for all $n$ and $k$ when
a single message is received at the decoder.

Theorem 7. For any $n$ and $k$, if $D_k < 1 - k/n$ and rational, then for any
$(R, D_1, \ldots, D_k, \ldots, D_n) \in \mathcal{RD}_{\mathrm{worst}}$, $D_1 \ge 1 - 1/n$.

Proof: See Appendix F. ■
The following theorem shows that the MDS coding scheme is Pareto optimal in the distortions
$D_1, \ldots, D_{k-1}$.

Theorem 8. For any $n$ and $k$,
$(R, 1-\frac{1}{n}, 1-\frac{2}{n}, \ldots, 1-\frac{k-1}{n}, D_k, \frac{n-k-1}{n-k}D_k, \frac{n-k-2}{n-k}D_k, \ldots, \frac{1}{n-k}D_k, 0)$
is Pareto optimal in $D_1, \ldots, D_{k-1}$, i.e., there does not exist
$(R', D'_1, \ldots, D'_n) \in \mathcal{RD}_{\mathrm{worst}}$ such that either $R' < R$, or $R' \le R$,
$D'_i \le 1 - i/n$ for all $1 \le i \le k-1$ and $D'_j < 1 - j/n$ for at least one $j$,
$1 \le j \le k-1$.

Proof: See Appendix G. ■

Footnote (to Theorem 6): For Theorem 6 and the subsequent theorems in this subsection, we consider
rational values for $D_k$, since any code over a finite blocklength can yield only rational distortions.

The following theorem shows that for certain values of $m$, $n$ and $k$, the MDS coding scheme
is optimal when $m$ messages are received.

Theorem 9. For any n and k, ifm < | and m\n (m divides n), then for any (R, Di, . . . , Dk, . . . , Dr, 
TJT) D > 1 _ 

Proof: See Appendix H. ■

V. A General Multiple Descriptions Architecture 

The schemes described in this paper provide a substrate that can be used to construct no-excess-rate
multiple descriptions codes for a general source using only a point-to-point rate-distortion
code for that source. We illustrate this idea for a Gaussian source, where the resulting scheme
is optimal in a certain sense. The extension to arbitrary sources should be clear from the proof.
Suppose that $\{X_t\}_{t=1}^{\infty}$ is a memoryless Gaussian process, where each $X_t$ is a random
vector with marginal distribution $\mathcal{N}(0, K_x)$. The distortion for a source-reconstruction
pair $(X^l, \hat{X}^l)$ is measured as $\frac{1}{l}\sum_{t=1}^{l} E[(X_t - \hat{X}_t)(X_t - \hat{X}_t)^T]$.
We compare distortions in the positive definite sense, i.e., $D \succeq D'$ iff $D - D' \succeq 0$.

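Definition 6. The rate-distortion vector $(R, D_1, \ldots, D_n)$ is achievable if for some $l$ there exist
encoders $f_i^{(l)}$, $i \in \mathcal{N}$, mapping length-$l$ source sequences to
$\{1, \ldots, M_i^{(l)}\}$, and decoders $g_{\mathcal{K}}^{(l)}$ acting on
$\prod_{k \in \mathcal{K}} \{1, \ldots, M_k^{(l)}\}$, $\mathcal{K} \subseteq \mathcal{N}$,
$\mathcal{K} \ne \emptyset$, such that

$$R \ge \frac{1}{l} \log M_i^{(l)} \text{ for all } i, \text{ and}$$

$$D_k \succeq \frac{1}{l}\sum_{t=1}^{l} E\left[(X_t - \hat{X}_{\mathcal{K},t})(X_t - \hat{X}_{\mathcal{K},t})^T\right] \text{ for all } \mathcal{K} \subseteq \mathcal{N},\ |\mathcal{K}| = k,$$

where $\hat{X}_{\mathcal{K}}^l = E[X^l \mid f_i^{(l)}(X^l), i \in \mathcal{K}]$.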



We use $\mathcal{RD}_{\mathrm{gauss}}$ to denote the set of achievable rate-distortion vectors and
$\overline{\mathcal{RD}}_{\mathrm{gauss}}$ to denote its closure. We consider symmetric descriptions, i.e.,
each description has the same rate $R_g$ and the distortion constraint depends only on the number
of descriptions received. We consider the case where there is no excess rate for every $k$ out of
$n$ descriptions, i.e., $kR_g = R(D_k)$, where $R(\cdot)$ is the Shannon rate-distortion function and

$$R(D_k) = \min_{D} \frac{1}{2} \log \frac{|K_x|}{|D|} \quad \text{s.t. } D \preceq D_k \text{ and } D \preceq K_x.$$

Thus $R_g = \frac{1}{k} R(D_k)$ bits/symbol.

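Theorem 10. $\left(R_g, \frac{D_k + (n-1)K_x}{n}, \frac{2D_k + (n-2)K_x}{n}, \ldots, \frac{(k-1)D_k + (n-k+1)K_x}{n}, D_k, \ldots, D_k\right) \in \overline{\mathcal{RD}}_{\mathrm{gauss}}$.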



Proof: Fix $D_k$ and consider an integer $l$. We know from rate-distortion theory that there
exists an integer $l' \ge lR(D_k)$ such that any source sequence $X^l$ of $l$ symbols can be compressed
to a sequence $Y^{l'}$ consisting of $l'$ bits and then reproduced from $Y^{l'}$ with distortion
$\preceq D_k + \epsilon I$ for $l$ sufficiently large. Choose now a blocklength $nl$. The $nl$ source
symbols can be compressed into a binary sequence $Y^{nl'}$ taking values in $\mathcal{X}^{nl'}$, which can
then be transmitted to the decoder over the $n$ channels using the achievability scheme proposed in
Section IV-A. Thus every description contains $l'$ uncoded bits of $Y^{nl'}$. In particular, the decoder
should be able to completely reconstruct $Y^{nl'}$ upon reception of any $k$ descriptions, i.e., there
is no distortion for every $k$ out of $n$ descriptions (this corresponds to a special case of
Theorem 5 with $D_k = 0$). Thus every set of $k$ descriptions must reveal $nl'$ bits, and therefore
the rate of a single description is $R = nl'/(knl) = l'/(kl)$ bits per symbol of $X^l$. Moreover, since
every description contains $l'$ uncoded bits, the decoder can reconstruct $ml'$ bits of $Y^{nl'}$ upon
reception of $m < k$ descriptions.

We now argue that
$\left(R_g, \frac{D_k + (n-1)K_x}{n}, \ldots, \frac{(k-1)D_k + (n-k+1)K_x}{n}, D_k, \ldots, D_k\right) \in \overline{\mathcal{RD}}_{\mathrm{gauss}}$.
The rate of every description is $R = l'/(kl)$. Moreover, any $m < k$ descriptions reveal $ml'$ bits of
$Y^{nl'}$. It follows from a time-sharing argument that upon reception of $m < k$ descriptions, the
decoder can reconstruct $X^{nl}$ with distortion $\preceq \frac{m(D_k + \epsilon I) + (n-m)K_x}{n}$. When $k$
or more descriptions are received, the decoder is able to reconstruct $Y^{nl'}$ completely and can
reconstruct $X^{nl}$ with distortion $\preceq D_k + \epsilon I$. Now let $l \to \infty$. Then we can let
$l' \to \infty$ such that $l'/l \to R(D_k)$ and $\epsilon \to 0$. Thus
$R = \frac{l'}{kl} \to \frac{1}{k}R(D_k) = R_g$, and so
$\left(R_g, \frac{D_k + (n-1)K_x}{n}, \ldots, \frac{(k-1)D_k + (n-k+1)K_x}{n}, D_k, \ldots, D_k\right) \in \overline{\mathcal{RD}}_{\mathrm{gauss}}$. ■

Next, we show that, for the special case of symmetric scalar Gaussian multiple descriptions
with two levels of receivers (where one receiver reconstructs the source from any $k$ out of
$n$ descriptions with distortion $D_k$ and the second receiver reconstructs the source from all $n$
descriptions with distortion $D_n$), and no excess rate for the second receiver, the aforementioned
scheme achieves the optimal $D_k$. It has been shown by Wang and Viswanath [28, Theorem 1]
that, given distortion constraints $D_k$ and $D_n$, the symmetric multiple description rate for an i.i.d.
vector Gaussian source with mean 0 and covariance $K_x$ is

$$R = \sup_{K_z \succ 0} \frac{1}{2} \log \frac{|K_x|^{\frac{1}{n}}\, |K_x + K_z|^{\frac{n-k}{kn}}\, |D_n + K_z|^{\frac{1}{n}}}{|D_n|^{\frac{1}{n}}\, |D_k + K_z|^{\frac{1}{k}}}.$$

Thus the sum rate of the $n$ descriptions is

$$nR = \sup_{K_z \succ 0} \frac{1}{2} \log \frac{|K_x|\, |K_x + K_z|^{\frac{n-k}{k}}\, |D_n + K_z|}{|D_n|\, |D_k + K_z|^{\frac{n}{k}}}. \qquad (1)$$



Theorem 11. For scalar Gaussian multiple descriptions (i.i.d. $\mathcal{N}(0, \sigma_x^2)$ Gaussian source) with
two levels of receivers (distortion constraints $D_k$ and $D_n$, respectively) and no excess rate for
the second receiver, $D_k \ge \frac{k}{n} D_n + \frac{n-k}{n} \sigma_x^2$.



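Proof: Assume WLOG that $\sigma_x^2 = 1$. Reducing (1) to the scalar case and using the no-excess-rate
condition gives

$$\frac{1}{2}\log\frac{1}{D_n} = \sup_{\lambda > 0} \frac{1}{2} \log \frac{(1+\lambda)^{\frac{n}{k}-1}(D_n + \lambda)}{D_n\,(D_k + \lambda)^{\frac{n}{k}}},$$

which implies

$$0 = \sup_{\lambda > 0} \frac{1}{2} \log \frac{(1+\lambda)^{\frac{n}{k}-1}(D_n + \lambda)}{(D_k + \lambda)^{\frac{n}{k}}}.$$

Define $f(\lambda) = \frac{(1+\lambda)^{\frac{n}{k}-1}(D_n + \lambda)}{(D_k + \lambda)^{\frac{n}{k}}}$. Then

$$0 = \sup_{\lambda > 0} \log_e f(\lambda)$$
$$= \sup_{\lambda > 0} \left[\left(\frac{n}{k} - 1\right)\log_e(1+\lambda) + \log_e(D_n + \lambda) - \frac{n}{k}\log_e(D_k + \lambda)\right]$$
$$= \sup_{\lambda > 0} \left[\log_e\frac{D_n + \lambda}{1 + \lambda} + \frac{n}{k}\log_e\frac{1 + \lambda}{D_k + \lambda}\right]$$
$$= \sup_{\lambda > 0} \left[\log_e\left(1 + \frac{D_n - 1}{1 + \lambda}\right) + \frac{n}{k}\log_e\left(1 + \frac{1 - D_k}{D_k + \lambda}\right)\right].$$

Define

$$g(\lambda) = \frac{\left(\frac{D_n - 1}{1+\lambda}\right)^2}{2\left(1 - \left|\frac{D_n - 1}{1+\lambda}\right|\right)^2} + \frac{n}{k}\cdot\frac{\left(\frac{1 - D_k}{D_k+\lambda}\right)^2}{2\left(1 - \left|\frac{1 - D_k}{D_k+\lambda}\right|\right)^2}.$$

Using the fact that

$$\log_e(1 + x) \ge x - \frac{x^2}{2(1 - |x|)^2} \quad \text{for } |x| < 1,$$

we obtain, for all sufficiently large $\lambda$,

$$\frac{D_k + \lambda}{1 + \lambda} \ge \frac{n}{k}\left(\frac{1 - D_k}{1 - D_n}\right) - \frac{D_k + \lambda}{1 - D_n}\, g(\lambda).$$

Now let $\lambda \to \infty$. Then $(D_k + \lambda) g(\lambda) \to 0$ and $\frac{D_k + \lambda}{1 + \lambda} \to 1$. We thus have

$$1 \ge \frac{n}{k}\left(\frac{1 - D_k}{1 - D_n}\right),$$

i.e.,

$$D_k \ge \frac{k}{n} D_n + \frac{n-k}{n}. \qquad ■$$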

VI. Decentralized Encoding 

In this section we characterize the optimal distortion tradeoff for the robust binary erasure CEO
problem. The robust binary erasure CEO problem is a generalization of the multiple descriptions
problem studied earlier in that the encoders observe an erased version of the source instead of
the source itself. In particular, let $Y_i = N_i \cdot X$, $i \in \mathcal{N}$, where $X \in \mathcal{X}$ and
$N_1, \ldots, N_n$ are i.i.d. Bernoulli with $0 < \Pr(N_1 = 0) = p < 1$. Thus the $Y_i$ take values in
$\hat{\mathcal{X}} = \{+, -, 0\}$. An encoder is a function
$f_i : \hat{\mathcal{X}}^l \to \{1, \ldots, M_i^{(l)}\}$, $i \in \mathcal{N}$. A decoder is a function
$g_{\mathcal{K}} : \prod_{k \in \mathcal{K}} \{1, \ldots, M_k^{(l)}\} \to \hat{\mathcal{X}}^l$, where
$\mathcal{K} \subseteq \mathcal{N}$ is the set of messages received. There are $n$ encoders. Encoder
$f_i$, $i \in \mathcal{N}$, observes $Y_i^l$ and transmits an encoded version of it over channel $i$. The
receiver either receives this description without errors or is not able to receive it at all.
Excluding the case where none of the messages is received, the receiver may receive $2^n - 1$
different combinations of the $n$ messages. Thus it can be represented by the $2^n - 1$ decoding
functions $g_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$. Based on the set of
received messages $\mathcal{K}$, the receiver employs the corresponding decoding function to output a
reconstruction $\hat{X}_{\mathcal{K}}^l$ of the original source string subject to a distortion constraint. We
consider symmetric rates, i.e., each message has the same rate $R$ and the distortion constraint
depends only on the number of messages received.


We measure the fidelity of the reconstruction using a family of distortion measures,
$\{d_\lambda\}_{\lambda > 0}$, where

$$d_\lambda(x, \hat{x}) = \begin{cases} 0 & \text{if } \hat{x} = x \\ 1 & \text{if } \hat{x} = 0 \\ \lambda & \text{otherwise.} \end{cases}$$

We are particularly interested in the large-$\lambda$ limit. In this regime, $d_\lambda$ approximates the erasure
distortion measure. We use this family of finite distortion measures because an infinite distortion
measure is too harsh for this setup: it does not allow decoding errors at all, even those that have
negligible probability.

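Definition 7. The rate-distortion vector $(R, D_1, D_2, \ldots, D_n)$ is achievable if there exists a block
length $l$ for which there exist encoders $f_i$, $i \in \mathcal{N}$, and decoders $g_{\mathcal{K}}$,
$\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$, such that

$$R \ge \frac{1}{l}\log M_i^{(l)} \text{ for all } i \in \mathcal{N}, \text{ and}$$

$$D_k \ge E\left[\frac{1}{l}\sum_{t=1}^{l} d_\lambda(X_t, \hat{X}_{\mathcal{K},t})\right] \text{ for all subsets of messages } \mathcal{K},\ |\mathcal{K}| = k.$$

Let $\mathcal{RD}_{CEO}(\lambda)$ denote the set of achievable rate-distortion vectors. Define

$$\mathcal{RD}_{CEO} = \bigcap_{\lambda \ge 1} \mathcal{RD}_{CEO}(\lambda). \qquad (2)$$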



We use $\overline{\mathcal{RD}}_{CEO}$ to denote the closure of $\mathcal{RD}_{CEO}$. Our main result is the
characterization of the optimal distortion tradeoff for an arbitrary code with respect to the number
of messages received. We show that if a code comes arbitrarily close to achieving the minimum
achievable distortion $D_k$ upon reception of $k$ messages, then the distortion it can achieve upon
reception of $\ell$ messages cannot be lower than $D_k^{\ell/k}$. Achievability can be shown by using a
random binning scheme based on $(n, k)$ source-channel erasure codes, proposed in BH. The result
therefore proves that $(n, k)$ source-channel erasure codes are optimal for this setup. Informally,
the scheme involves constructing a codebook $C_i$ for $Y_i$ at encoder $f_i$ and then binning all the
codewords independently and uniformly. Encoder $f_i$ observes $Y_i^l$ and then sends the bin index of
the corresponding codeword to the decoder. Upon receiving the messages, the decoder searches
the corresponding bins and outputs a reconstruction of the source sequence based on the bits
revealed by the codewords. If none of the decoded codewords reveals a particular source bit, then
the decoder just outputs an erasure in place of that bit. It can be verified that, for this scheme,
if the distortion upon reception of any $k$ messages is $D_k$, then the distortion upon reception of
any $\ell$ messages is $D_k^{\ell/k}$. The intuition is that if $q$ is the probability that a particular bit
is not revealed by a particular message, then the chance that $k$ messages will not reveal that bit
is $q^k$, and the chance that $\ell$ messages will not reveal that bit is $q^\ell = (q^k)^{\ell/k}$.

Before proving the converse for this problem, we will state and prove an outer bound on the
rate region of the multi-terminal source coding problem in the next subsection. We will then use
this bound to prove our result in Section VI-B.
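The intuition behind the exponent can be made concrete with a short calculation. The sketch assumes, as in the informal argument, that each message independently fails to reveal a given bit with the same probability $q$; `distortion_after` is a hypothetical helper name.

```python
def distortion_after(ell, k, Dk):
    """Erasure probability after ell messages when k messages leave a fraction Dk erased.

    Assumes each message independently hides a given bit with probability q = Dk**(1/k).
    """
    q = Dk ** (1.0 / k)
    return q ** ell


if __name__ == "__main__":
    # With k = 2 and D_k = 0.25, each message hides a bit with probability 0.5.
    for ell in range(2, 6):
        print(ell, distortion_after(ell, 2, 0.25))
```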



A. Outer Bound on the Rate Region of the Multi-terminal Source Coding Problem 

The term "multi-terminal source coding" typically refers to the problem of reconstructing two
correlated, separately encoded sources, each subject to a distortion constraint. In this paper we
use the term to refer to the more general model considered in [29]: we have an arbitrary number
of sources $Y_1, \ldots, Y_n$, with $Y_i$ taking values in the set $\mathcal{Y}_i$, encoders $f_i$,
$i \in \mathcal{N}$, a hidden source $Y_0$ which is not directly observed by any encoder or the decoder,
and a side information source $Y_{n+1}$, taking values in the set $\mathcal{Y}_{n+1}$, which is observed by
the decoder but not by any encoder. In particular, $\{Y_{0,t}, \ldots, Y_{n,t}, Y_{n+1,t}\}_{t=1}^{\infty}$ is a
vector-valued, finite-alphabet and memoryless source. Encoder $f_i$ observes a length-$l$ sequence of
$Y_i$ and transmits a message to the decoder based on the mapping

$$f_i^{(l)} : \mathcal{Y}_i^l \to \{1, \ldots, M_i^{(l)}\}.$$

We allow the decoder to reconstruct arbitrary functions of the sources $V_1, \ldots, V_J$ (with $V_j$,
$j = 1, \ldots, J$, taking values in the set $\mathcal{V}_j$) instead of, or in addition to, the sources
themselves. We also allow the decoder to reconstruct $V_1, \ldots, V_J$ from subsets of messages
$f_{\mathcal{K}} = \{f_k : k \in \mathcal{K}\}$, where $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$.
The decoder thus uses the mappings

$$(g_j^{\mathcal{K}})^{(l)} : \mathcal{Y}_{n+1}^l \times \prod_{k \in \mathcal{K}} \{1, \ldots, M_k^{(l)}\} \to \mathcal{V}_j^l, \text{ for } \mathcal{K} \subseteq \mathcal{N},\ \mathcal{K} \ne \emptyset,\ j = 1, \ldots, J.$$

We thus have $J$ distortion measures

$$d_j : \prod_{i=0}^{n+1} \mathcal{Y}_i \times \mathcal{V}_j \to \mathbb{R}_+.$$

For every $j = 1, \ldots, J$, we impose a common distortion constraint for all size-$k$ subsets of
messages used to reconstruct $V_j$. More precisely, for every $j = 1, \ldots, J$, all $\binom{n}{k}$ subsets of
messages of size $k$, when used to reconstruct $V_j$, must satisfy a single distortion constraint. Thus
there are $nJ$ distortion constraints in total. We will use the following notation.
Let $Y_{\mathcal{K}}$ denote $(Y_k)_{k \in \mathcal{K}}$, and $Y_{\mathcal{K}^c}$ denote
$Y_{\mathcal{N} \setminus \mathcal{K}}$. Moreover, $Y_{i,a:b}$ denotes $\{Y_{i,a}, Y_{i,a+1}, \ldots, Y_{i,b}\}$.



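Definition 8. The rate-distortion vector

$$(\mathbf{R}, \mathbf{D}) = (R_1, R_2, \ldots, R_n, D_{1,1}, D_{2,1}, \ldots, D_{n,1}, D_{1,2}, \ldots, D_{n,2}, \ldots, D_{1,J}, \ldots, D_{n,J})$$

is achievable if for some $l$ there exist encoders $f_i^{(l)}$, $i \in \mathcal{N}$, and decoders
$(g_j^{\mathcal{K}})^{(l)}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$, $j = 1, \ldots, J$, such that

$$R_i \ge \frac{1}{l}\log M_i^{(l)},\ i \in \mathcal{N}, \text{ and} \qquad (3)$$

$$D_{k,j} \ge \max_{\mathcal{K} : |\mathcal{K}| = k} E\left[\frac{1}{l}\sum_{t=1}^{l} d_j(Y_{0,t}, Y_{\mathcal{K},t}, Y_{n+1,t}, V_{j,t})\right] \text{ for } j = 1, \ldots, J.$$

As in [29], we use $\mathcal{RD}_\star$ to denote the set of achievable rate-distortion vectors and
$\overline{\mathcal{RD}}_\star$ to denote its closure. We use the following definitions from [29].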



Definition 9. Let $Y_0, Y_1, \ldots, Y_{n+1}$ be generic random variables with the distribution of the source
at a single time. Let $\Gamma_o$ denote the set of finite-alphabet random variables
$\gamma = (U_1, \ldots, U_n, V_1, \ldots, V_J, W, T)$ satisfying

(i) $(W, T)$ is independent of $(Y_0, Y_{\mathcal{N}}, Y_{n+1})$,

(ii) $U_i \leftrightarrow (Y_i, W, T) \leftrightarrow (Y_0, Y_{i^c}, Y_{n+1}, U_{i^c})$, shorthand for "$U_i$,
$(Y_i, W, T)$ and $(Y_0, Y_{i^c}, Y_{n+1}, U_{i^c})$ form a Markov chain in this order", for all
$i \in \mathcal{N}$, and

(iii) $(Y_0, Y_{\mathcal{N}}, W) \leftrightarrow (U_{\mathcal{N}}, T) \leftrightarrow (V_1, \ldots, V_J)$.

Definition 10. Let $\psi$ denote the set of finite-alphabet random variables $Z$ with the property that
$Y_1, \ldots, Y_n$ are conditionally independent given $(Z, Y_{n+1})$.

There are many ways of coupling a given $Z \in \psi$ and $\gamma \in \Gamma_o$ to the source. In this paper, we
shall only consider the Markov coupling for which
$Z \leftrightarrow (Y_0, Y_{\mathcal{N}}, Y_{n+1}) \leftrightarrow \gamma$. We now state our outer bound.

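Definition 11. Let

$$\mathcal{RD}_o(Z, \gamma) = \Big\{ (\mathbf{R}, \mathbf{D}) : \sum_{i \in \mathcal{K}} R_i \ge \max\big( I(Z; U_{\mathcal{K}} | Y_{n+1}, T),\ I(Z; U_{\mathcal{K}} | U_{\mathcal{K}^c}, Y_{n+1}, T) \big) + \sum_{i \in \mathcal{K}} I(Y_i; U_i | Z, Y_{n+1}, W, T) \text{ for all } \mathcal{K} \subseteq \mathcal{N},$$
$$\text{and } D_{k,j} \ge \max_{\mathcal{K} : |\mathcal{K}| = k} E[d_j(Y_0, Y_{\mathcal{K}}, Y_{n+1}, V_j)] \text{ for } j = 1, \ldots, J \Big\}.$$

Then define

$$\mathcal{RD}_o = \bigcap_{Z \in \psi} \bigcup_{\gamma \in \Gamma_o} \mathcal{RD}_o(Z, \gamma).$$

Theorem 12. $\mathcal{RD}_\star \subseteq \mathcal{RD}_o$.

Proof: See Appendix I. ■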
The proposed bound differs from the bound in [29] in two ways. Whereas the bound
in [29] lower bounds the sum rate of a subset $\mathcal{K}$ of messages by
$I(Z; U_{\mathcal{K}} | U_{\mathcal{K}^c}, Y_{n+1}, T)$, the proposed bound potentially improves upon it by taking the
maximum of $I(Z; U_{\mathcal{K}} | U_{\mathcal{K}^c}, Y_{n+1}, T)$ and $I(Z; U_{\mathcal{K}} | Y_{n+1}, T)$. Moreover, the
proposed bound imposes distortion constraints for source reproductions based on subsets of messages,
rather than only for reproductions based on all of the messages. These improvements were needed in
order to use the bound to prove our converse result for the robust CEO problem: the robust CEO
problem requires the decoder to be able to reconstruct the source sequence from a subset
$\mathcal{K}$ of the encoded messages, subject to a distortion constraint, without having any knowledge
about the messages in $\mathcal{K}^c$. The outer bound in [29] cannot be applied to this problem, since,
as mentioned earlier, it lower bounds the sum rate of the subset of messages $\mathcal{K}$ by
$I(Z; U_{\mathcal{K}} | U_{\mathcal{K}^c}, Y_{n+1}, T)$, which involves conditioning on the messages in
$\mathcal{K}^c$.

Although we apply our improved outer bound to the robust binary erasure CEO problem in
this paper, we believe that it could potentially be useful for other instances of the multi-terminal
source coding problem.

B. Optimal Distortion Tradeoffs for Robust CEO 

As defined earlier, the robust binary erasure CEO problem is an instance of the general
multi-terminal source coding problem in which the hidden source $Y_0$ takes values in
$\mathcal{X} = \{+, -\}$. There is no side information and the decoder is interested in reproducing an
estimate $V_1$ of the hidden source $Y_0$ only. In order to be consistent with the notation used in the
beginning of this section, we shall henceforth use $X$ instead of $Y_0$ and $\hat{X}$ instead of $V_1$.
Here $\hat{X}$ takes values in $\{+, -, 0\}$. We begin with a few lemmas.

Let g{-) denote the function on [p, oo) defined by 



- p<x<l 
X > 1. 

The following corollary and lemma, which we state without proof, are from [29].

Corollary 1. [29, Corollary 1] The function $g(y^{1/n})$ is non-increasing and convex in $y$ on
$[p^n, \infty)$.

Lemma 2. ||29l Lemma 6] Suppose p'^ < D and (U, X) is such that 

(i) E[d\X,X)]<D, 

(ii) Ui ^ (X, Yjc, Ujc) for all i G Af, and 
(Hi) (X, Y) ^ U ^ X. 



32n /2Dy/"^ 1 



then 



1 " S 
- J2 li^u Um >g[{D + + 25 log -. 



n 



1=1 



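For the robust binary CEO problem, let $\hat{X}_{\mathcal{K}}^l$ be the source reconstruction when the subset
$\mathcal{K}$ of messages is received at the receiver. We have the following lemma:

Lemma 3. Suppose $p^\ell \le D$ and that $(U, X, \hat{X}_{\mathcal{K}}, Y, W, T)$, for all $\mathcal{K}$ with
$|\mathcal{K}| = \ell$, is such that

(i) $(X, Y, U_{\mathcal{K}^c}, W) \leftrightarrow (U_{\mathcal{K}}, T) \leftrightarrow \hat{X}_{\mathcal{K}}$,

(ii) $U_i \leftrightarrow (Y_i, W, T) \leftrightarrow (X, Y_{i^c}, U_{i^c})$ for all $i \in \mathcal{N}$, and

(iii) $\frac{1}{\ell}\sum_{i \in \mathcal{K}} I(Y_i; U_i | X, W, T) \le g(D^{1/\ell})$.

Let $\tilde{D} = \max_{\mathcal{K} : |\mathcal{K}| = \ell} E[d_\lambda(X, \hat{X}_{\mathcal{K}})]$. For
$\delta \in (0, 1/2]$, if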



X > max 



32i 



2£ 



then 



5p{l-p) 
D>D-aD,6) 



D 

J 
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for some continuous ^ > satisfying ^{D, 0) = 0. 

Proof: See Appendix J. ■
We now prove our main result. Define



7^,(D,A) = inf{i?: G7^I?o(A)} 



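where $\mathcal{RD}_o(\lambda)$ is the region given by Definition 11 when the distortion measure is $d_\lambda$.

It was shown in [29, Section 3.2] that the sum rate of the binary erasure CEO problem with
$n$ encoders, given a distortion constraint $D_n$, is

$$\sum_{i=1}^{n} R_i = (1 - D_n) + n \cdot g(D_n^{1/n}).$$

(All logarithms and exponentiations in [29] have base $e$. Therefore the corresponding sum rate
expression in [29] is $(1 - D_n)\log 2 + n \cdot g(D_n^{1/n})$.)

It follows from this result that, for symmetric descriptions, if the distortion constraint for every
subset of $k$ messages is $D_k$ and every message has rate $R$, then the sum rate for any $k$ descriptions
is given by

$$kR = (1 - D_k) + k \cdot g(D_k^{1/k}),$$

which implies

$$R = \frac{1 - D_k}{k} + g(D_k^{1/k}). \qquad (4)$$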

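Theorem 13. If $(R, D_1, \ldots, D_n) \in \overline{\mathcal{RD}}_{CEO}$, and

$$D_k = \inf\{D : (R, 1, 1, \ldots, 1, D, 1, \ldots, 1) \in \overline{\mathcal{RD}}_{CEO}\},$$

i.e.,

$$R = \frac{1 - D_k}{k} + g(D_k^{1/k}),$$

then

$$D_\ell \ge (D_k)^{\ell/k} \text{ for all } \ell \ge k.$$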



Proof: It suffices to prove Theorem 13 for a single subset of messages of size $\ell \ge k$. Fix
$\delta \in (0, 1/2]$, and suppose $\lambda$ satisfies



A > max 



32i fDk^"^ 



6p{l -p) 



6 



^AU logarithms and exponentiations in 1291 have base e. Therefore the corresponding sum rate expression in 1291 is Ri 
(l-D„)log2 + n -5(1)1). 
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It follows from taking Z = X in the definition of lZT>o{Z^'^) (Definition 11) and from the 
monotonicity of 7^o(D, A) with respect to A that there exist i? e and 7 G To such that, for 
all subsets /C of size k, 

Du + 5>E\d\X,X^)l and 



kR + 6> A;7^o(D, A) + (5 > /(X; Uyc|T) + ^ /(y^; ly, T). 



lj:m-Mx,w,T)<gm + '^ + l. 



(5) 



From and it follows that 

. i J: ky, u.ix, WX) < + ,(Dh . l- (6) 

ieK. 

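Now by the data processing inequality,

$$I(X; U_{\mathcal{K}} | T) = I(X; U_{\mathcal{K}}, T) \ge I(X; \hat{X}_{\mathcal{K}}).$$

Let $e = 1(X \cdot \hat{X}_{\mathcal{K}} = -1)$. We then have

$$I(X; U_{\mathcal{K}} | T) \ge H(X) - H(X | \hat{X}_{\mathcal{K}})$$
$$= 1 - H(X, e | \hat{X}_{\mathcal{K}})$$
$$= 1 - H(e | \hat{X}_{\mathcal{K}}) - H(X | e, \hat{X}_{\mathcal{K}})$$
$$\ge 1 - h(D_k/\lambda) - \Pr(\hat{X}_{\mathcal{K}} = 0)$$
$$\ge (1 - D_k) - h(\delta).$$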

Using this and (pi), we can upper bound | Xlie/c H^i'^ Ui\X, W,T) as follows: 



(7) 



We will show 

J J2 HYf, U,\X, W, T) < g{Dl) + + ^ i > k. (8) 

i=l 

Suppose the $U_i$ are ordered according to the mutual informations $I(Y_i; U_i | X, W, T)$, i.e., we
have an ordered list of messages $U_1, \ldots, U_\ell$ in which, for all $i, j \in \{1, \ldots, \ell\}$,
$U_i$ and $U_j$ are such that $I(Y_i; U_i | X, W, T) \le I(Y_j; U_j | X, W, T)$ when $i < j$. The last $k$
elements of this list, $U_{\ell-k+1}, \ldots, U_\ell$, must satisfy (7), i.e.,

^ J2 I{Y,;U,\Yo,W,T)<g{Dl) + ^+^-. (9) 
i=e-k+i 

All other elements in the list yield equal or strictly smaller mutual informations. Therefore, if 
we average over a larger subset of messages, the average will never increase. We thus have 

$$\frac{1}{\ell}\sum_{i=1}^{\ell} I(Y_i; U_i | X, W, T) \le \frac{1}{k}\sum_{i=\ell-k+1}^{\ell} I(Y_i; U_i | X, W, T).$$
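The averaging step used here is elementary: the mean of a list can never exceed the mean of its $k$ largest entries. A minimal sketch, with hypothetical helper names `mean` and `top_k_mean` and made-up numbers standing in for the mutual informations, is:

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def top_k_mean(values, k):
    """Mean of the k largest entries (the last k elements of the sorted list)."""
    return mean(sorted(values)[-k:])


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    infos = [0.1, 0.4, 0.2, 0.9, 0.5]
    print(mean(infos), top_k_mean(infos, 2))
```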

Using this and (9), we obtain (8). Define

{Dk - C(A, 5))^ = 9^' (^9{Dh + ^ + 
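for some continuous $\zeta \ge 0$ satisfying $\zeta(D_k, 0) = 0$. We then have

$$\frac{1}{\ell}\sum_{i=1}^{\ell} I(Y_i; U_i | X, W, T) \le g\big((D_k - \zeta(D_k, \delta))^{1/k}\big). \qquad (10)$$

From (10), we obtain, by using Lemma 3,

$$D_\ell \ge (D_k - \zeta(D_k, \delta))^{\ell/k} - \xi(D_\ell, \delta)$$

for some continuous $\xi \ge 0$ satisfying $\xi(D_\ell, 0) = 0$. The proof is completed by letting
$\lambda \to \infty$ and then $\delta \to 0$. ■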

Appendix A 
Preliminaries 

We define a multi-letter mutual information as follows: 

Ik{XuX2;...;Xk) = D . . . , j 

K 



J2H{X,)-H{Xr,...,XK) 



i=l 



In particular, $I_1(X) = 0$. The multi-letter mutual information, as defined above, is a measure
of the mutual dependence among $K$ random variables and is different from McGill's multivariate
mutual information [26]. We note the following properties of $I_K(X_1; X_2; \ldots; X_K)$.

1) $I_K(X_1; \ldots; X_K) = \sum_{i=1}^{K} H(X_i) - H(X_1, \ldots, X_K) \ge 0$.
2) $I_K(X_1; \ldots; X_K) \ge I_m(X_1; \ldots; X_m) + I_{(K-m+1)}(f(X_1, \ldots, X_m); X_{m+1}; \ldots; X_K)$, where
$f(X_1, \ldots, X_m)$ is a function of the random variables $X_1, \ldots, X_m$, $m < K$.
Remark: This property holds by symmetry for the general case when $f(\cdot)$ is a function of
any size-$m$ subset of $X_1, \ldots, X_K$.
Proof: 

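\[
\begin{aligned}
I_K(X_1; \ldots; X_K)
&= \sum_{i=1}^{m} H(X_i) + \sum_{i=m+1}^{K} H(X_i) - H(X_1, \ldots, X_m) - H(X_{m+1}, \ldots, X_K \mid X_1, \ldots, X_m) \\
&= I_m(X_1; \ldots; X_m) + \sum_{i=m+1}^{K} H(X_i) - H(X_{m+1}, \ldots, X_K \mid X_1, \ldots, X_m) \\
&= I_m(X_1; \ldots; X_m) + \sum_{j=m+1}^{K} H(X_j) - H(X_{m+1}, \ldots, X_K \mid X_1, \ldots, X_m, f(X_1, \ldots, X_m)) \\
&\ge I_m(X_1; \ldots; X_m) + \sum_{j=m+1}^{K} H(X_j) - H(X_{m+1}, \ldots, X_K \mid f(X_1, \ldots, X_m)) \\
&= I_m(X_1; \ldots; X_m) + I_{(K-m+1)}(f(X_1, \ldots, X_m); X_{m+1}; \ldots; X_K),
\end{aligned}
\]

where the solitary inequality holds because conditioning never increases entropy. ■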

3) $I_K(X_1; X_2; \ldots; X_i; \ldots; X_K) \ge I_K(X_1; X_2; \ldots; f(X_i); \ldots; X_K)$, where $f(X_i)$ is a
function of the random variable $X_i$. This is the data processing inequality for the multi-letter
mutual information and is a special case of Property 2.
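As a numerical sanity check of the definition and of Properties 2 and 3, the following sketch evaluates the multi-letter mutual information on a small joint distribution. The example (two fair bits and their XOR) and all helper names are our own illustrative assumptions; entropies are in bits.

```python
import itertools
import math
from collections import defaultdict

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)

def multi_info(joint, groups):
    """Sum of the entropies of each group minus the entropy of all groups jointly."""
    all_idx = tuple(i for g in groups for i in g)
    return sum(entropy(marginal(joint, g)) for g in groups) - entropy(marginal(joint, all_idx))

def apply_function(joint, idx, f):
    """Append f(coordinates idx) as a new last coordinate of every outcome."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome + (f(*(outcome[i] for i in idx)),)] += p
    return dict(out)

# X1, X2 are independent fair bits and X3 = X1 XOR X2.
joint = {(a, b, a ^ b): 0.25 for a, b in itertools.product((0, 1), repeat=2)}

i3 = multi_info(joint, [(0,), (1,), (2,)])          # I_3(X1; X2; X3)
i2 = multi_info(joint, [(0,), (1,)])                # I_2(X1; X2)
with_f = apply_function(joint, (0, 1), lambda a, b: a ^ b)
i2_f = multi_info(with_f, [(3,), (2,)])             # I_2(f(X1, X2); X3)
with_g = apply_function(joint, (2,), lambda c: 0)   # g(X3) is a constant
i3_g = multi_info(with_g, [(0,), (1,), (3,)])       # I_3(X1; X2; g(X3))
```

Here Property 2 holds with equality, since `i3` equals `i2 + i2_f`, and Property 3 holds because replacing `X3` by a constant removes all dependence.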

Appendix B 
Proof of Theorem 2

The proof of the first part of Theorem 2 is simple. Let $D_k \ge 1 - \frac{k}{n}$. No excess rate for every
$k$ descriptions implies that every description has rate $R$. If the decoder receives $m$ descriptions,
then it receives a sum-rate of $mR$ bits per source symbol. Using the point-to-point rate-distortion
function for a binary source with erasure distortion, we get $D_m \ge 1 - mR$.
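The bound just stated is simple arithmetic; a small sketch follows. It assumes the no-excess-rate relation $kR = 1 - D_k$ from the problem setup, and clipping at zero is our own addition, reflecting only that a distortion cannot be negative. The function names are hypothetical.

```python
from fractions import Fraction

def per_description_rate(d_k, k):
    """Rate R of each description when any k descriptions have sum rate 1 - D_k."""
    return (1 - Fraction(d_k)) / k

def distortion_lower_bound(d_k, k, m):
    """Lower bound 1 - m*R on D_m, clipped at 0 (the clipping is our assumption)."""
    rate = per_description_rate(d_k, k)
    return max(Fraction(0), 1 - m * rate)

# Illustrative values: k = 2 and D_k = 1/2 give R = 1/4.
bounds = [distortion_lower_bound(Fraction(1, 2), 2, m) for m in (1, 2, 3)]
```

For m = k the bound returns $D_k$ itself, as expected.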

The proof of the second part of Theorem 2 is less trivial. We begin with a lemma.

Definition 12. Let $X$ be a binary random variable taking values in $\mathcal{X}$. An erased version of $X$
is a random variable $\hat{X}$, taking values in $\hat{\mathcal{X}}$, such that $\Pr(X = +, \hat{X} = -) = \Pr(X = -, \hat{X} = +) = 0$.

Lemma 4. Let $\hat{X}_1, \ldots, \hat{X}_n$ be erased versions of a uniform binary random variable $X$ taking
values in $\{+, -\}$. If $(1 - \frac{1}{n})^k < \frac{1}{2}$ and $I_k(\hat{X}_{s_1}; \ldots; \hat{X}_{s_k}) = 0$ $\forall\, S = \{s_1, \ldots, s_k\}$, $S \subseteq \mathcal{N}$, $|S| = k$,
then $\sum_{i=1}^{n} \Pr(\hat{X}_i = 0) \ge n - 1$.

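Proof: $(1 - \frac{1}{n})^k < \frac{1}{2} \Rightarrow (\frac{1}{2})^{1/k} > 1 - \frac{1}{n}$. We have the following four cases:
Case I: There exists $i \in \mathcal{N}$ such that $\Pr(\hat{X}_i = +) > 0$ and $\Pr(\hat{X}_i = -) > 0$.
Assume $i = 1$ without loss of generality. Since $\hat{X}_1, \ldots, \hat{X}_n$ are erased versions of the same
variable, they can never disagree in the source symbol they reveal (i.e., if $\hat{X}_i = +$ for some
$i \in \mathcal{N}$, then the rest cannot be $-$, and if $\hat{X}_i = -$, then the rest cannot be $+$). Thus
$\Pr(\hat{X}_1 = +, \hat{X}_j = -) = 0$, $j \in \{2, \ldots, n\}$. Since $I_k(\hat{X}_{s_1}; \ldots; \hat{X}_{s_k}) = 0$ for any set of $k$ variables
containing $\hat{X}_1$ and $\hat{X}_j$, $\hat{X}_1$ and $\hat{X}_j$ must be independent. Thus

\[ \Pr(\hat{X}_1 = +) \cdot \Pr(\hat{X}_j = -) = \Pr(\hat{X}_1 = +, \hat{X}_j = -) = 0 \;\Rightarrow\; \Pr(\hat{X}_j = -) = 0. \tag{11} \]

Likewise, $\Pr(\hat{X}_1 = -, \hat{X}_j = +) = 0 \Rightarrow \Pr(\hat{X}_j = +) = 0$. Thus $\Pr(\hat{X}_j = 0) = 1$ and so
$\sum_{i=1}^{n} \Pr(\hat{X}_i = 0) \ge n - 1$.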

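Case II: There exists $i \in \mathcal{N}$ such that $\Pr(\hat{X}_i = +) > 0$ and $\Pr(\hat{X}_i = -) = 0$, and Case I does
not hold.

Let $S = \{s_1, \ldots, s_k\}$ be a size-$k$ subset of $\mathcal{N}$. For all $\Gamma \subseteq S$, denote by $E_\Gamma$ the event that
$\hat{X}_{s_j} = -$ $\forall\, s_j \in \Gamma$, and $\hat{X}_{s_j} = 0$ $\forall\, s_j \notin \Gamma$, $s_j \in S$. Now since $\Pr(\hat{X}_{s_j} = -) = 0$ from (11),
$\Pr(E_\Gamma) = 0$ $\forall\, \Gamma \ne \emptyset$. Thus

\[ \Pr(X = -) \le \sum_{\Gamma \subseteq S} \Pr(E_\Gamma) = \Pr(\hat{X}_{s_1} = \hat{X}_{s_2} = \cdots = \hat{X}_{s_k} = 0). \tag{12} \]

Since $\Pr(X = -) = 1/2$ and $(\hat{X}_{s_1}, \ldots, \hat{X}_{s_k})$ are independent, (12) yields

\[ \prod_{i=1}^{k} \Pr(\hat{X}_{s_i} = 0) = \Pr(\hat{X}_{s_1} = \hat{X}_{s_2} = \cdots = \hat{X}_{s_k} = 0) \ge \frac{1}{2}. \]

In order to lower bound $\sum_{i=1}^{n} \Pr(\hat{X}_i = 0)$, we solve

\[
\begin{aligned}
\min \quad & \sum_{j=1}^{n} \Pr(\hat{X}_j = 0) \\
\text{s.t.} \quad & \prod_{i=1}^{k} \Pr(\hat{X}_{s_i} = 0) \ge \frac{1}{2} \quad \forall\, S = \{s_1, \ldots, s_k\} \subseteq \mathcal{N}.
\end{aligned}
\]

This is a convex optimization problem, as can be readily seen by substituting $a_j = \log \Pr(\hat{X}_j = 0)$,
and can therefore be solved by choosing $\Pr(\hat{X}_j = 0) = (\frac{1}{2})^{1/k}$ for $j = 1, \ldots, n$. Thus

\[ \sum_{j=1}^{n} \Pr(\hat{X}_j = 0) \ge n \left(\tfrac{1}{2}\right)^{1/k} > n(1 - 1/n) = n - 1. \]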
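The role of the hypothesis $(1 - 1/n)^k < 1/2$ in Case II can be checked numerically. The sketch below evaluates the symmetric solution $\Pr(\hat{X}_j = 0) = (1/2)^{1/k}$ and compares the resulting sum with $n - 1$; the function names are ours and floating-point arithmetic is assumed to be accurate enough for these small parameters.

```python
def symmetric_solution(n, k):
    """Common erasure probability (1/2)**(1/k) and the resulting sum over n variables."""
    p = 0.5 ** (1.0 / k)
    return p, n * p

def hypothesis_holds(n, k):
    """The condition (1 - 1/n)**k < 1/2 assumed in Lemma 4."""
    return (1.0 - 1.0 / n) ** k < 0.5

# Example: n = 4 descriptions, k = 3.
p, total = symmetric_solution(4, 3)
print(hypothesis_holds(4, 3), total > 4 - 1)
```

The two conditions are equivalent by taking $k$-th roots, so the strict inequality $n(1/2)^{1/k} > n - 1$ holds exactly when the hypothesis does.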

Case III: There exists $i \in \mathcal{N}$ such that $\Pr(\hat{X}_i = -) > 0$ and $\Pr(\hat{X}_i = +) = 0$, and Case I does
not hold.

This case is symmetric to Case II.

Case IV: For all $i \in \mathcal{N}$, $\Pr(\hat{X}_i = +) = \Pr(\hat{X}_i = -) = 0$.

We have $\sum_{i=1}^{n} \Pr(\hat{X}_i = 0) \ge \sum_{i=2}^{n} \Pr(\hat{X}_i = 0) = n - 1$. ■
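We are now in a position to prove the second part of Theorem 2. Let $D_k < 1 - \frac{k}{n}$, $D_k$
rational, and $(1 - \frac{1}{n})^k < \frac{1}{2}$, and let $f_i$, $i \in \mathcal{N}$ and $g_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$ be a code that achieves
the rate-distortion vector $(R, D_1, \ldots, D_k, \ldots, D_n)$. Let $f_i$, $i \in \mathcal{N}$ have rate $R$. We have

\[ lR \ge H(f_i), \quad i \in \mathcal{N}. \]

Let $\hat{X}_S^l$ be the reconstruction when the source is reconstructed from a set $S$ of descriptions.
Then $\forall\, S = \{s_1, \ldots, s_k\} \subseteq \mathcal{N}$, $|S| = k$, we have

\[ H(f_{s_1}, \ldots, f_{s_k}) \ge H(\hat{X}_S^l) \ge l(1 - D_k). \]

Thus

\[ I_k(f_{s_1}; \ldots; f_{s_k}) = \sum_{i=1}^{k} H(f_{s_i}) - H(f_{s_1}, \ldots, f_{s_k}) \le klR - l(1 - D_k) = 0. \]

Let $\hat{X}_{s_i}^l$ be the reconstruction when the decoder receives the $s_i$-th description only. Then
$I_k(\hat{X}_{s_1}^l; \ldots; \hat{X}_{s_k}^l) \le I_k(f_{s_1}; \ldots; f_{s_k}) = 0$ (Property 3) and so $I_k(\hat{X}_{s_1,t}; \ldots; \hat{X}_{s_k,t}) = 0$,
$t \in \{1, \ldots, l\}$. By Lemma 4, $\sum_{i=1}^{n} \Pr(\hat{X}_{it} = 0) \ge n - 1$ for $t \in \{1, \ldots, l\}$. Thus

\[ \max_{i \in \mathcal{N}} \left( \frac{1}{l} \sum_{t=1}^{l} \Pr(\hat{X}_{it} = 0) \right) \ge \frac{1}{nl} \sum_{t=1}^{l} \sum_{i=1}^{n} \Pr(\hat{X}_{it} = 0) \ge 1 - \frac{1}{n}. \]

This completes the proof.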
Appendix C
Proof of Theorem 3

We establish two lemmas before proving Theorem 3.

Lemma 5. Let $X_1$, $X_2$, and $X_3$ be Bernoulli random variables such that $I(X_i; X_j) = 0$, $\forall\, i, j \in \{1, 2, 3\}$, $i \ne j$,
and $\Pr(X_1 = X_2 = X_3 = 0) \ge \frac{1}{2}$. Let $p = \max(\Pr(X_1 = 0), \Pr(X_2 = 0))$. Then

\[ \Pr(X_3 = 0) \ge \frac{1}{2} + \frac{p(1 - p)}{2p - 1}. \]
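Proof: If $p = 1$, then the conclusion follows directly from the hypothesis, so suppose that
$p < 1$. Let $p_i$ denote $\Pr(X_i = 0)$, $p(x_1, x_2, x_3)$ denote $\Pr(X_1 = x_1, X_2 = x_2, X_3 = x_3)$, and
$p_{x_3 | x_1, x_2}$ denote $\Pr(X_3 = x_3 \mid X_1 = x_1, X_2 = x_2)$. Let $q_0 = p_{0|0,0}$, $q_1 = p_{0|0,1}$, and $q_2 = p_{0|1,1}$. We
thus have $p(0,0,0) = p_1 p_2 q_0$, $p(0,1,0) = p_1 (1 - p_2) q_1$, and $p(1,1,0) = (1 - p_1)(1 - p_2) q_2$. Then

\[
\begin{aligned}
\Pr(X_1 = 0, X_3 = 0) &= p(0,0,0) + p(0,1,0) \\
&= p_1 (p_2 q_0 + (1 - p_2) q_1), \qquad (13) \\
\Pr(X_2 = 1, X_3 = 0) &= p(0,1,0) + p(1,1,0) \\
&= (1 - p_2)(p_1 q_1 + (1 - p_1) q_2). \qquad (14)
\end{aligned}
\]

Since $(X_1, X_3)$ and $(X_2, X_3)$ are pairwise independent, we have, from (13) and (14),

\[
\begin{aligned}
\Pr(X_1 = 0, X_3 = 0) = p_1 p_3 &= p_1 (p_2 q_0 + (1 - p_2) q_1) \\
\Rightarrow\; p_3 &= p_2 q_0 + (1 - p_2) q_1, \qquad (15) \\
\Pr(X_2 = 1, X_3 = 0) = (1 - p_2) p_3 &= (1 - p_2)(p_1 q_1 + (1 - p_1) q_2) \\
\Rightarrow\; p_3 &= p_1 q_1 + (1 - p_1) q_2. \qquad (16)
\end{aligned}
\]

From (15) and (16),

\[ p_1 q_1 + (1 - p_1) q_2 = p_2 q_0 + (1 - p_2) q_1 \;\Rightarrow\; q_2 = \frac{p_2 q_0 - (p_1 + p_2 - 1) q_1}{1 - p_1}. \tag{17} \]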
Since $p(0,0,0) \ge 1/2$ by hypothesis, we have $p_1 p_2 \ge 1/2$, and thus $p_1 + p_2 - 1 > 0$. Now since
$q_2 \le 1$, (17) gives

\[ 1 \ge \frac{p_2 q_0 - (p_1 + p_2 - 1) q_1}{1 - p_1} \;\Rightarrow\; q_1 \ge \frac{p_2 q_0 - (1 - p_1)}{p_1 + p_2 - 1}. \tag{18} \]

Now

\[ p(0,0,0) = p_1 p_2 q_0 \ge \frac{1}{2} \;\Rightarrow\; p_2 q_0 \ge \frac{1}{2 p_1}. \tag{19} \]

Assume without loss of generality that $p_1 \ge p_2$. Then $p_1 + p_2 \le 2 p_1$. Substituting this and (19)
into (18) yields

\[ q_1 \ge \frac{\frac{1}{2 p_1} + p_1 - 1}{2 p_1 - 1} = \frac{p_1}{2 p_1 - 1} - \frac{1}{2 p_1}. \tag{20} \]



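Upon substituting (19) and (20) into (15), we get

\[
\begin{aligned}
p_3 &\ge \frac{1}{2 p_1} + (1 - p_2) \left( \frac{p_1}{2 p_1 - 1} - \frac{1}{2 p_1} \right) \\
&\ge \frac{1}{2 p_1} + (1 - p_1) \left( \frac{p_1}{2 p_1 - 1} - \frac{1}{2 p_1} \right) \\
&= \frac{1}{2} + \frac{p_1 (1 - p_1)}{2 p_1 - 1},
\end{aligned}
\]

where the last inequality follows because $p_2 \le p_1$ and $\frac{p_1}{2 p_1 - 1} - \frac{1}{2 p_1} \ge 0$. ■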
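The two algebraic simplifications used in the proof of Lemma 5, the identity in (20) and the final closed form, can be verified with exact rational arithmetic. This is only a spot check at a few hypothetical values of $p_1$ and $p_2$, not a proof, and the function names are ours.

```python
from fractions import Fraction

def q1_bound(p1):
    """Left-hand form of the bound in (20)."""
    return (1 / (2 * p1) + p1 - 1) / (2 * p1 - 1)

def q1_bound_simplified(p1):
    """Right-hand form of the bound in (20)."""
    return p1 / (2 * p1 - 1) - 1 / (2 * p1)

def p3_intermediate(p1, p2):
    """Bound on p3 obtained by substituting (19) and (20) into (15)."""
    return 1 / (2 * p1) + (1 - p2) * q1_bound_simplified(p1)

def p3_bound(p1):
    """Closed form 1/2 + p1(1 - p1)/(2 p1 - 1) from the lemma statement."""
    return Fraction(1, 2) + p1 * (1 - p1) / (2 * p1 - 1)

samples = [Fraction(3, 4), Fraction(4, 5), Fraction(9, 10)]
checks = [(q1_bound(p) == q1_bound_simplified(p),
           p3_intermediate(p, p) == p3_bound(p)) for p in samples]
```

Because the bracketed term is nonnegative, lowering $p_2$ below $p_1$ can only increase `p3_intermediate`, which is why the closed form is the smaller of the two bounds.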

Corollary 2. Let $X_1$, $X_2$, $X_3$, and $X_4$ be Bernoulli random variables such that $I(X_i; X_j) = 0$,
$\forall\, i, j \in \{1, 2, 3, 4\}$, $i \ne j$, and $\Pr(X_1 = X_2 = X_3 = X_4 = 0) \ge \frac{1}{2}$. Then

\[ \sum_{i=1}^{4} \Pr(X_i = 0) \ge 3. \]

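Proof: Let $p_i = \Pr(X_i = 0)$. Assume WLOG that $p_1 \ge p_2 \ge p_3 \ge p_4$. Now $p_3 p_4 =
\Pr(X_3 = X_4 = 0) \ge 1/2$ by hypothesis, which implies $p_3 \ge 1/\sqrt{2}$ and $p_4 \ge 1/(2 p_3)$. Applying
Lemma 5 to $X_2$, $X_3$, and $X_4$ gives $p_2 \ge \frac{1}{2} + \frac{p_3 (1 - p_3)}{2 p_3 - 1}$. Thus

\[
\begin{aligned}
\sum_{i=1}^{4} p_i &= p_1 + p_2 + p_3 + p_4 \\
&\ge 2 p_2 + p_3 + p_4 \\
&\ge 2 \max\left( p_3, \frac{1}{2} + \frac{p_3 (1 - p_3)}{2 p_3 - 1} \right) + p_3 + \frac{1}{2 p_3} \\
&\ge \min_{x \in [\frac{1}{\sqrt{2}}, 1]} \left[ 2 \max\left( x, \frac{1}{2} + \frac{x (1 - x)}{2x - 1} \right) + x + \frac{1}{2x} \right].
\end{aligned}
\]

Since $\frac{1}{2} + \frac{x(1-x)}{2x-1}$ is monotonically decreasing in $x$ for $x \in (1/2, 1]$, it is easy to verify that

\[
\max\left( x, \frac{1}{2} + \frac{x (1 - x)}{2x - 1} \right) =
\begin{cases}
x & \text{if } x \ge \frac{1}{2} + \frac{1}{2\sqrt{3}}, \\
\frac{1}{2} + \frac{x (1 - x)}{2x - 1} & \text{if } x \le \frac{1}{2} + \frac{1}{2\sqrt{3}},
\end{cases}
\]

where $\frac{1}{2} + \frac{1}{2\sqrt{3}}$ is the admissible solution to the equation $x = \frac{1}{2} + \frac{x(1-x)}{2x-1}$. Thus

\[
\begin{aligned}
&\min_{x \in [\frac{1}{\sqrt{2}}, 1]} \left[ 2 \max\left( x, \frac{1}{2} + \frac{x (1 - x)}{2x - 1} \right) + x + \frac{1}{2x} \right] \\
&\quad = \min\left( \min_{x \in [\frac{1}{\sqrt{2}}, \frac{1}{2} + \frac{1}{2\sqrt{3}}]} \left[ 2 \left( \frac{1}{2} + \frac{x (1 - x)}{2x - 1} \right) + x + \frac{1}{2x} \right], \; \min_{x \in [\frac{1}{2} + \frac{1}{2\sqrt{3}}, 1]} \left[ 2x + x + \frac{1}{2x} \right] \right) \\
&\quad = \min\left( \min_{x \in [\frac{1}{\sqrt{2}}, \frac{1}{2} + \frac{1}{2\sqrt{3}}]} \left[ 1 + \frac{1}{2x} + \frac{x}{2x - 1} \right], \; \min_{x \in [\frac{1}{2} + \frac{1}{2\sqrt{3}}, 1]} \left[ 3x + \frac{1}{2x} \right] \right) \\
&\quad = \min(3, 3) = 3,
\end{aligned}
\]

where the penultimate step follows from the fact that $1 + \frac{1}{2x} + \frac{x}{2x-1}$ is monotonically
decreasing in $x$ for $x \in [\frac{1}{\sqrt{2}}, \frac{1}{2} + \frac{1}{2\sqrt{3}}]$ and takes a minimum value of 3 at $x = \frac{1}{2} + \frac{1}{2\sqrt{3}}$, and that
$3x + \frac{1}{2x}$ is monotonically increasing in $x$ for $x \in [\frac{1}{2} + \frac{1}{2\sqrt{3}}, 1]$ and takes a minimum value of 3
at $x = \frac{1}{2} + \frac{1}{2\sqrt{3}}$. ■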
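The one-dimensional minimization in the proof of Corollary 2 is easy to check numerically. The sketch below evaluates the objective on a uniform grid over $[1/\sqrt{2}, 1]$; the grid resolution and function names are our own choices, and a grid search only approximates the true minimizer.

```python
import math

def objective(x):
    """2*max(x, 1/2 + x(1-x)/(2x-1)) + x + 1/(2x), the function minimized in the proof."""
    lemma5_bound = 0.5 + x * (1 - x) / (2 * x - 1)
    return 2 * max(x, lemma5_bound) + x + 1 / (2 * x)

def grid_minimum(steps=20000):
    """Smallest objective value, and where it occurs, on a grid over [1/sqrt(2), 1]."""
    lo, hi = 1 / math.sqrt(2), 1.0
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min((objective(x), x) for x in xs)

x_star = 0.5 + 1 / (2 * math.sqrt(3))   # crossover point of the two branches
best_value, best_x = grid_minimum()
```

The grid minimum sits at the crossover point `x_star`, where both branches of the maximum agree and the objective equals 3.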



Lemma 6. Let $\hat{X}_1, \ldots, \hat{X}_4$ be erased versions of a uniform binary random variable $X$ taking
values in $\{+, -\}$. If $I(\hat{X}_i; \hat{X}_j) = 0$, $i, j \in \{1, \ldots, 4\}$, $i \ne j$, then

\[ \sum_{i=1}^{4} \Pr(\hat{X}_i = 0) \ge 3. \]

Proof: We have the four cases as in the proof of Lemma 4.
Case I: There exists $i \in \{1, 2, 3, 4\}$ such that $\Pr(\hat{X}_i = +) > 0$ and $\Pr(\hat{X}_i = -) > 0$.
Just as in the proof of Lemma 4, we have $\sum_{j=1}^{4} \Pr(\hat{X}_j = 0) \ge 4 - 1 = 3$.

Case II: There exists $i \in \{1, 2, 3, 4\}$ such that $\Pr(\hat{X}_i = +) > 0$ and $\Pr(\hat{X}_i = -) = 0$, and Case
I does not hold.

Assume $i = 1$ WLOG. Then from (11), $\Pr(\hat{X}_j = -) = 0$ for $j \in \{2, 3, 4\}$. Thus the $\hat{X}_j$ are
effectively binary random variables such that $\Pr(\hat{X}_1 = \cdots = \hat{X}_4 = 0) \ge 1/2$. By Corollary 2,
$\sum_{i=1}^{4} \Pr(\hat{X}_i = 0) \ge 3$.

Case III: There exists $i \in \{1, 2, 3, 4\}$ such that $\Pr(\hat{X}_i = -) > 0$ and $\Pr(\hat{X}_i = +) = 0$, and
Case I does not hold.
This case is analogous to Case II.

Case IV: For all $i \in \{1, 2, 3, 4\}$, $\Pr(\hat{X}_i = +) = \Pr(\hat{X}_i = -) = 0$.

We have $\sum_{i=1}^{4} \Pr(\hat{X}_i = 0) \ge \sum_{i=2}^{4} \Pr(\hat{X}_i = 0) = 4 - 1 = 3$. ■

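We are now in a position to prove Theorem 3. Let $f_i$, $i \in \mathcal{N}$ and $g_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{N}$ be a code that
achieves $(\frac{1 - D_2}{2}, D_1, D_2, D_3, D_4)$. Using the same argument as that in the proof of the second
part of Theorem 2, we have for $i, j \in \{1, 2, 3, 4\}$, $i \ne j$ that $I(\hat{X}_i^l; \hat{X}_j^l) \le I(f_i; f_j) = 0$ and thus
$I(\hat{X}_{it}; \hat{X}_{jt}) = 0$ for all $t \in \{1, \ldots, l\}$. By Lemma 6, $\sum_{i=1}^{4} \Pr(\hat{X}_{it} = 0) \ge 3$ for $t \in \{1, \ldots, l\}$.
It follows that

\[ \max_{i} \left( \frac{1}{l} \sum_{t=1}^{l} \Pr(\hat{X}_{it} = 0) \right) \ge \frac{1}{4l} \sum_{t=1}^{l} \sum_{i=1}^{4} \Pr(\hat{X}_{it} = 0) \ge \frac{3}{4}. \]

This completes the proof.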



Appendix D 
Proof of Theorem 4

We establish two lemmas before proving Theorem 4.

Lemma 7. Let $X_1, \ldots, X_n$ be Bernoulli random variables such that $I(X_i; X_j) = 0$ $\forall\, i, j \in
\mathcal{N}$, $i \ne j$, and $\Pr(X_1 = X_2 = \cdots = X_n = 0) \ge \frac{1}{2}$. Then

\[ \frac{1}{n} \sum_{i=1}^{n} \Pr(X_i = 0) \ge 1 - \frac{2}{n}. \]



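Proof: Let $p_i$ denote $\Pr(X_i = 0)$ and let $q_i = \Pr(X_i = 1) = 1 - p_i$. Since the $X_i$'s are
pairwise independent, we have

\[ E\left[ \frac{1}{n} \sum_{i=1}^{n} X_i \right] = \frac{1}{n} \sum_{i=1}^{n} q_i, \qquad \mathrm{Var}\left[ \frac{1}{n} \sum_{i=1}^{n} X_i \right] = \frac{1}{n^2} \sum_{i=1}^{n} p_i q_i. \]

Let $a > \sqrt{\frac{2}{n^2} \sum_{i=1}^{n} p_i q_i}$. Then, by Chebyshev's inequality,

\[ \Pr\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \sum_{i=1}^{n} q_i \right| \ge a \right) \le \frac{1}{n^2 a^2} \sum_{i=1}^{n} p_i q_i < \frac{1}{2}. \]

Let $E_1$ and $E_2$ be the events $\left| \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \sum_{i=1}^{n} q_i \right| < a$ and $X_1 = X_2 = \cdots = X_n = 0$,
respectively. Then $\Pr(E_1) > \frac{1}{2}$, and $\Pr(E_2) \ge \frac{1}{2}$ by hypothesis. Since $\Pr(E_1) + \Pr(E_2) > 1$,
$\Pr(E_1 \cap E_2) > 0$. This implies that

\[ \frac{1}{n} \sum_{i=1}^{n} q_i < a \;\Rightarrow\; \frac{1}{n} \sum_{i=1}^{n} p_i > 1 - a. \]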



Since a was arbitrary, this implies 



n 

-E 



> 1 



1=1 



Moreover, 



^ n 1 " 

n ^-^ n ^-^ 

i=l i=l 



\ 



i=l 



A little algebra gives 



i=l 



\| i=i 



i=l 



Substituting (22) into (21) yields 



n ^-^ 

1=1 



2 2 

4-2 = i--. 



• = X„ = 0, 
Pr(^2) > 1, 



(21) 



(22) 



Lemma 8. Let $\hat{X}_1, \ldots, \hat{X}_n$ be erased versions of a uniform binary random variable $X$ taking
values in $\{+, -\}$. If $I(\hat{X}_i; \hat{X}_j) = 0$, $i, j \in \mathcal{N}$, $i \ne j$, then

\[ \sum_{i=1}^{n} \Pr(\hat{X}_i = 0) \ge n - 2. \]

Proof: We have Cases I, II, III, and IV as in the proof of Lemma 4. Cases I and IV are
the same as those in Lemma 4, so we will just mention Cases II and III.
Case II: There exists $i \in \mathcal{N}$ such that $\Pr(\hat{X}_i = +) > 0$ and $\Pr(\hat{X}_i = -) = 0$, and Case I does
not hold.

Assume $i = 1$ WLOG. Then from (11), $\Pr(\hat{X}_j = -) = 0$ for $j \in \{2, \ldots, n\}$. Thus the $\hat{X}_j$'s
are always erased when the binary source $X = -$, and so $\Pr(\hat{X}_1 = \cdots = \hat{X}_n = 0) \ge 1/2$. By
Lemma 7, $\sum_{i=1}^{n} \Pr(\hat{X}_i = 0) \ge n - 2$. The proof of Case III is analogous to the proof of Case
II. ■

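We are now in a position to prove Theorem 4. Let $f_i$, $i \in \mathcal{N}$ and $g_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{N}$ be a code that
achieves $(\frac{1 - D_2}{2}, D_1, D_2, \ldots, D_n)$. Using the same argument as that in the proof of the second
part of Theorem 2, we have for $i, j \in \mathcal{N}$, $i \ne j$ that $I(\hat{X}_i^l; \hat{X}_j^l) \le I(f_i; f_j) = 0$ and thus
$I(\hat{X}_{it}; \hat{X}_{jt}) = 0$ for $t \in \{1, \ldots, l\}$. By Lemma 8, $\sum_{i=1}^{n} \Pr(\hat{X}_{it} = 0) \ge n - 2$ for $t \in \{1, \ldots, l\}$.

It follows that

\[ \max_{i} \left( \frac{1}{l} \sum_{t=1}^{l} \Pr(\hat{X}_{it} = 0) \right) \ge \frac{1}{nl} \sum_{t=1}^{l} \sum_{i=1}^{n} \Pr(\hat{X}_{it} = 0) \ge 1 - \frac{2}{n}. \]

This completes the proof.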



Appendix E 
Proof of Lemma 1

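For any $t \in \{1, \ldots, l\}$, we have exactly one of the following four cases:
Case I: $\exists\, i \in \mathcal{N}$ s.t. $\Pr(\hat{X}_{it}(X^l) = +) > 0$ and $\Pr(\hat{X}_{it}(X^l) = -) > 0$.

Case II: $\exists\, i \in \mathcal{N}$ s.t. $\Pr(\hat{X}_{it}(X^l) = +) > 0$ and $\Pr(\hat{X}_{it}(X^l) = -) = 0$, and Case I does not
hold.

Case III: $\exists\, i \in \mathcal{N}$ s.t. $\Pr(\hat{X}_{it}(X^l) = -) > 0$ and $\Pr(\hat{X}_{it}(X^l) = +) = 0$, and Case I does not
hold.

Case IV: $\forall\, i \in \mathcal{N}$, $\Pr(\hat{X}_{it}(X^l) = +) = \Pr(\hat{X}_{it}(X^l) = -) = 0$.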

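Let $B_1$, $B_2$, $B_3$ and $B_4$ be the sets of $t \in \{1, \ldots, l\}$ satisfying Cases I, II, III and IV, respectively.
Moreover, let $|B_1| = b_1$, $|B_2| = b_2$, $|B_3| = b_3$ and $|B_4| = b_4$. Then $b_1 + b_2 + b_3 + b_4 = l$. Now
consider a source string $(x^*)^l$ such that $x_t^* = -$ if $t \in B_2$ and $x_t^* = +$ if $t \in B_3$. We have

\[
\begin{aligned}
\sum_{i=1}^{n} \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{it}(x^l))
&\ge \frac{1}{l} \sum_{i=1}^{n} \sum_{t=1}^{l} d(x_t^*, \hat{X}_{it}((x^*)^l)) \\
&= \frac{1}{l} \sum_{t \in B_1} \sum_{i=1}^{n} d(x_t^*, \hat{X}_{it}((x^*)^l)) + \frac{1}{l} \sum_{t \in B_2} \sum_{i=1}^{n} d(x_t^*, \hat{X}_{it}((x^*)^l)) \\
&\quad + \frac{1}{l} \sum_{t \in B_3} \sum_{i=1}^{n} d(x_t^*, \hat{X}_{it}((x^*)^l)) + \frac{1}{l} \sum_{t \in B_4} \sum_{i=1}^{n} d(x_t^*, \hat{X}_{it}((x^*)^l)).
\end{aligned}
\]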

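Consider now $t \in B_1$. Since $\hat{X}_{1t}(X^l), \ldots, \hat{X}_{nt}(X^l)$ are erased versions of the same binary random
variable $X_t$, they can never disagree in the source symbol they reveal. We therefore have
$\Pr(\hat{X}_{it}(X^l) = +, \hat{X}_{jt}(X^l) = -) = 0$, $j \in \mathcal{N}$, $j \ne i$. Since $\hat{X}_{it}(X^l)$ and $\hat{X}_{jt}(X^l)$, $i, j \in \mathcal{N}$,
$i \ne j$, are pairwise independent, we have

\[ \Pr(\hat{X}_{it}(X^l) = +) \cdot \Pr(\hat{X}_{jt}(X^l) = -) = \Pr(\hat{X}_{it}(X^l) = +, \hat{X}_{jt}(X^l) = -) = 0 \;\Rightarrow\; \Pr(\hat{X}_{jt}(X^l) = -) = 0, \tag{23} \]

since $\Pr(\hat{X}_{it}(X^l) = +) > 0$. Repeating the same analysis with $\Pr(\hat{X}_{it}(X^l) = -, \hat{X}_{jt}(X^l) = +)$
yields $\Pr(\hat{X}_{jt}(X^l) = +) = 0$. Thus $\Pr(\hat{X}_{jt}(X^l) = 0) = 1$ for all $j \in \mathcal{N}$, $j \ne i$, and therefore
$\hat{X}_{jt}((x^*)^l) = 0$ for all $j \in \mathcal{N}$, $j \ne i$. Similarly, it follows from (23) that $\Pr(\hat{X}_{jt}(X^l) = -) = 0$
for $j \in \mathcal{N}$, $j \ne i$ if $t \in B_2$ and $\Pr(\hat{X}_{jt}(X^l) = +) = 0$ for $j \in \mathcal{N}$, $j \ne i$ if $t \in B_3$. Thus by
construction, $\hat{X}_i^l((x^*)^l)$, $i \in \mathcal{N}$, must have $\hat{X}_{it}((x^*)^l) = 0$ for $t \in B_2 \cup B_3 \cup B_4$. It follows that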



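\[
\begin{aligned}
\sum_{i=1}^{n} \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{it}(x^l))
&\ge \frac{1}{l} \sum_{t \in B_1} \sum_{i=1}^{n} 1(\hat{X}_{it}((x^*)^l) = 0) + \frac{1}{l} \sum_{t \in B_2} \sum_{i=1}^{n} 1(\hat{X}_{it}((x^*)^l) = 0) \\
&\quad + \frac{1}{l} \sum_{t \in B_3} \sum_{i=1}^{n} 1(\hat{X}_{it}((x^*)^l) = 0) + \frac{1}{l} \sum_{t \in B_4} \sum_{i=1}^{n} 1(\hat{X}_{it}((x^*)^l) = 0) \\
&\ge \frac{1}{l} b_1 (n - 1) + \frac{1}{l} b_2 n + \frac{1}{l} b_3 n + \frac{1}{l} b_4 n \\
&= \frac{1}{l} (nl - b_1) \\
&= n - \frac{b_1}{l} \ge n - 1.
\end{aligned}
\]

This completes the proof.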
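The final counting step of the proof can be sketched as follows. The block sizes $b_1, \ldots, b_4$ below are hypothetical, and the function name is ours; the sketch only reproduces the arithmetic $\frac{1}{l}(b_1(n-1) + (b_2 + b_3 + b_4)n) = n - b_1/l$.

```python
from fractions import Fraction

def lemma1_lower_bound(n, b1, b2, b3, b4):
    """Evaluate (1/l) * (b1*(n-1) + (b2+b3+b4)*n) with l = b1+b2+b3+b4."""
    l = b1 + b2 + b3 + b4
    return Fraction(b1 * (n - 1) + (b2 + b3 + b4) * n, l)

# Hypothetical split of l = 10 time indices among Cases I-IV, with n = 4.
value = lemma1_lower_bound(4, 3, 2, 2, 3)
```

The bound is smallest when every index falls in Case I, in which case it equals exactly $n - 1$.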

Appendix F 
Proof of Theorem 7

Let $D_k < 1 - \frac{k}{n}$ and rational. Let $f_i$, $i \in \mathcal{N}$ and $g_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{N}$, $\mathcal{K} \ne \emptyset$, be a code that achieves
$(R, D_1, \ldots, D_k, \ldots, D_n)$. Let $R$ be the rate of $f_i$, $i \in \mathcal{N}$. Consider endowing the source with
an i.i.d. uniform distribution over $\mathcal{X}^l$ for analysis purposes. Then for all $i \in \mathcal{N}$,

\[ lR \ge H(f_i). \tag{24} \]



Let $\hat{X}_S^l$ be the reconstruction when the source is reconstructed from a set $S$ of descriptions.
Then $\forall\, S = \{s_1, \ldots, s_k\} \subseteq \mathcal{N}$, $|S| = k$, we have

\[ H(f_{s_1}, \ldots, f_{s_k}) \ge H(\hat{X}_S^l) \ge I(X^l; \hat{X}_S^l) \ge l(1 - D_k), \]

where the final inequality follows because the average distortion is no higher than the worst-case
distortion. Thus

\[ I_k(f_{s_1}; \ldots; f_{s_k}) = \sum_{i=1}^{k} H(f_{s_i}) - H(f_{s_1}, \ldots, f_{s_k}) \le klR - l(1 - D_k) = 0. \tag{25} \]



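Let $\hat{X}_{s_i}^l$ be the reconstructed source string when the decoder has access to the $s_i$-th description
only. By Property 3 of the multi-letter mutual information, $I_k(\hat{X}_{s_1}^l; \ldots; \hat{X}_{s_k}^l) \le I_k(f_{s_1}; \ldots; f_{s_k}) = 0$
for all $S \subseteq \mathcal{N}$, $|S| = k$. By Property 2 of the multi-letter mutual information, $I(\hat{X}_i^l; \hat{X}_j^l) = 0$
for all $i, j \in \mathcal{N}$, $i \ne j$, and thus $I(\hat{X}_{it}; \hat{X}_{jt}) = 0$ for all $i, j \in \mathcal{N}$, $i \ne j$, and $t = 1, \ldots, l$. Now
if any two of the $\hat{X}_i^l$ disagree in a source symbol they reveal, then the resulting single-message
distortion is going to be $\infty$ and the result follows trivially, so suppose that the $\hat{X}_i^l$ are consistent.
Then by Lemma 1 we have

\[ \sum_{i=1}^{n} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{it}) \ge n - 1, \]

which implies

\[ D_1 = \max_{i \in \mathcal{N}} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{it}) \ge \frac{n - 1}{n} = 1 - \frac{1}{n}. \]

This completes the proof.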



Appendix G 
Proof of Theorem 8



Consider $R$ first. If $R < \frac{1 - D_k}{k}$ then the sum rate of any $k$ descriptions is strictly less than
$1 - D_k$, and the source string cannot be reconstructed with distortion $D_k$. Thus the rate of each
description must be at least $\frac{1 - D_k}{k}$. Now, in light of the previous theorem, it suffices to show
that for any $(R, D_1, \ldots, D_k, \ldots, D_n) \in \mathcal{RD}_{\text{worst}}$, if $D_1 = 1 - \frac{1}{n}$, then $D_m \ge 1 - \frac{m}{n}$ for $m < k$.
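Let $S = \{s_1, \ldots, s_k\}$ and $\mathcal{M} = \{s_1, \ldots, s_m\}$. Let $\hat{X}_{\mathcal{M}}^l$ be the source reconstruction when the
decoder has access to the set of descriptions indexed by the elements in $\mathcal{M}$. Then from (25) and
Properties 2 and 3 of the multi-letter mutual information, it follows that

\[ I_{(k-m+1)}(\hat{X}_{\mathcal{M}}^l; \hat{X}_{s_{m+1}}^l; \ldots; \hat{X}_{s_k}^l) \le I_k(f_{s_1}; \ldots; f_{s_k}) = 0, \]

and thus $I_{(k-m+1)}(\hat{X}_{\mathcal{M},t}; \hat{X}_{s_{m+1},t}; \ldots; \hat{X}_{s_k,t}) = 0$ for $t = 1, \ldots, l$. This implies that for each $t$, the
$(n - m + 1)$ random variables $\{\hat{X}_{\mathcal{M},t}; \hat{X}_{s_{m+1},t}; \ldots; \hat{X}_{s_n,t}\}$ are pairwise independent, and therefore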

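by Lemma 1,

\[ \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M},t}) + \sum_{i=m+1}^{n} \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{s_i,t}) \ge n - m. \]

Since $D_1 = 1 - \frac{1}{n}$, we have

\[ \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{s_i,t}) \le 1 - \frac{1}{n} \]

for $m + 1 \le i \le n$, and thus

\[
\begin{aligned}
\max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M},t})
&\ge n - m - \sum_{i=m+1}^{n} \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{s_i,t}) \\
&\ge n - m - (n - m) \left( 1 - \frac{1}{n} \right) \\
&= \frac{n - m}{n} = 1 - \frac{m}{n},
\end{aligned}
\]

which implies

\[ D_m = \max_{|\mathcal{M}| = m} \max_{x^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M},t}) \ge 1 - \frac{m}{n}. \]

This completes the proof.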
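The last chain of inequalities reduces to one line of arithmetic, which the following sketch checks with exact rationals. It assumes $D_1 = 1 - 1/n$ as in the proof; the function name is ours.

```python
from fractions import Fraction

def dm_lower_bound(n, m):
    """Evaluate n - m - (n - m)(1 - 1/n), the bound obtained for D_m."""
    single_description = 1 - Fraction(1, n)   # assumed value of D_1
    return (n - m) - (n - m) * single_description

# Example with n = 6 descriptions and m = 2 of them received.
bound = dm_lower_bound(6, 2)
```

The result always simplifies to $1 - m/n$, so each additional description received lowers the bound by $1/n$.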



Appendix H 
Proof of Theorem 9

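Since $m$ divides $n$, we can form $n/m$ sets consisting of $m$ messages each. Denote these sets by
$\mathcal{M}_1, \ldots, \mathcal{M}_{n/m}$, where $\mathcal{M}_i \subseteq \mathcal{N}$, $|\mathcal{M}_i| = m$, and $\mathcal{M}_i \cap \mathcal{M}_j = \emptyset$, $i, j \in \{1, \ldots, n/m\}$,
$i \ne j$. Since $m < k/2$, there exists a set $S = \{s_1, \ldots, s_k\}$ of $k$ messages containing $\mathcal{M}_i$ and $\mathcal{M}_j$
for some $i, j \in \{1, \ldots, n/m\}$, $i \ne j$. Let $\hat{X}_{\mathcal{M}_i}^l$ be the source reconstruction when the decoder
has access to the messages in $\mathcal{M}_i$ only. By Property 2 of the multi-letter mutual information, it
follows that for the set $S$ containing $\mathcal{M}_i$ and $\mathcal{M}_j$,

\[ I(\hat{X}_{\mathcal{M}_i}^l; \hat{X}_{\mathcal{M}_j}^l) \le I_{(k-2m+2)}(\hat{X}_{\mathcal{M}_i}^l; \hat{X}_{\mathcal{M}_j}^l; f_r; \ldots; f_{r+k-2m-1}), \]

where $f_r, \ldots, f_{r+k-2m-1} \in \{f_{s_1}, \ldots, f_{s_k}\} \setminus \{\mathcal{M}_i, \mathcal{M}_j\}$. By Lemma 1 we have

\[ \sum_{i=1}^{n/m} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M}_i,t}) \ge \frac{n}{m} - 1, \]

and thus

\[
\begin{aligned}
\max_{\mathcal{M} \subseteq \mathcal{N},\, |\mathcal{M}| = m} \; \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M},t})
&\ge \max_{i \in \{1, \ldots, n/m\}} \max_{x^l \in \mathcal{X}^l} \frac{1}{l} \sum_{t=1}^{l} d(x_t, \hat{X}_{\mathcal{M}_i,t}) \\
&\ge \frac{\frac{n}{m} - 1}{\frac{n}{m}} = 1 - \frac{m}{n}.
\end{aligned}
\]

This completes the proof.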



Appendix I 
Proof of Theorem 12

This bound differs only slightly from the outer bound proposed in [29], and much of the
proof is similar to that in [29]. Suppose $(R, D)$ is achievable. Let $f_1^{(l)}, \ldots, f_n^{(l)}$ be encoders and
((7^)', /C C A/" be decoders satisfying Take any Z in and augment the sample space 
to include Z^ so that (Z^, Fo.t, Y^i, y„+i t) is independent over t E {1,. . . ,/}. Next let T be 
uniformly distributed over {!,...,/} and independent of Z'-, Yq, Yj^ and Yl^_^^. Then define 

$Z = Z_T$

$Y_i = Y_{i,T}$ for $i \in \mathcal{N}$
$V_j = V_{j,T}$ for $j = 1, \ldots, J$

w = {{z^]\{Zr]AyUi}\{yn^i,T]). 

It can be verified tliat 7 = (U_^^, Vi, . . . , Vj, W, T) is in To and that, together with Yq, Y^, Yn+i, 
and Z, it satisfies the Markov coupling. It suffices to show that (R, D) is in TZVo{Z,'j). Note 
that (|3]) implies 

Dk,j > max E[rfj(Fo,T, Y/c,t, >^n+i,T, V,-,t)] for j = 1, . . . , J, 

K.:\K.\=k 

i.e., 

Dfcj > max E[dj{Yo, Yfc, F„+i, V,)] for j = 1, . . . , J. 

A;:|/C|=A: 

Second, by the cardinality bound on entropy and the fact that conditioning never increases 
entropy, 

= /(z',Yj,;(/,(r/))^^^|F^^,). (26) 
By the chain rule for mutual information, 

/ [Z\Y^^; (/.(F/)),^,^ \yJ,^,) = I [Z\ {MY!))^^^ + I (yJ,; (/.(F/)),,^ . 



The rest of the proof is similar to that in [29]. The main difference between this proof and the
proof in [29] is that here we do not condition on {fi{Yl)).^^^ in ([26]). Taking the maximum 
over this bound and the bound in [29] yields the desired outer bound.

Appendix J 
Proof of Lemma 3

Assume WLOG that )C = {1, . . . , i}. For each possible realization {w, t) of {W, T), let 

\[ D_{w,t} = E[d_\lambda(X, \hat{X}_{\mathcal{K}}) \mid W = w, T = t]. \]
Let S = {{w,t) : Duj^t < V^}- Then by Markov's inequality. 



Pr((iy,T) iS)<^<5. 

v A 



(27) 
In particular, $\Pr((W, T) \in S) > 0$. Also, for any $(w, t) \in S$,



32i f2D^t 



p{l-p) \ A 



i/i 



< 5. 



Thus, by Lemma [2| if {w, t) E S, 



1 ^ 

- J2 ^(^- U,\X,W = w,T = t)>g {{D^^t + + 26 log 



i=l 



By averaging over (w, t) E S and invoking Corollary [T| we obtain 



Pr((iy,T) G 5) 



(to,i)e5 i=l 

> g{{D + 6)'/') + 26log^-. 



Therefore, 



1 ^ 

-5^/(F,;f/,|X,l^,T)> 



i=l 



g{{D + 6y/') + 26\o^ 

> giiD + 5y/') + 26\o^ 
g{{D + aD.5)Yl') 



■Vi{{W,T) E S) 
(1-5) 



for some continuous ^ > satisfying ,^(i),0) = 0. It follows from this and constraint (Hi) of 
the lemma that g{D^^^) > g{{D + ^(Z), 5))^/^), and from the monotonicity of g{D^^^) in D 
(Corollary [T]), we obtain 

D + aD,6)>D, 



and thus 



This completes the proof. 



D>D-aD,5). 